How to Secure Your Website Part I: Communication

First published: 16th of Dec 2013 for Orbit Media Studios

Security is about risk management. Online, security is about reducing the risk of exposing information to the general Internet.

Consider the two actions occurring on any device connected to the Internet:

  • Communication
  • Storage

Communication

Communication is the heart of the Internet. The standard Internet protocol suite, known as TCP/IP (Transmission Control Protocol and Internet Protocol), is the basis for a collection of additional protocols designed to interconnect computer systems across the world in different ways. For example:

  • Domain Name - DNS (Domain Name System)
  • Email - SMTP (Simple Mail Transfer Protocol)
  • Web - HTTP (Hypertext Transfer Protocol)

Unfortunately, in the initial designs of the Internet, preventing unauthorized access to data while in transit and the verification of the communicating parties were not primary concerns. As a result, many of the protocols that use TCP/IP do not incorporate encryption or other security mechanisms by default.

The consequence is that anyone can "listen in" (not just the NSA) as data is transmitted across the Internet. That is, none of the protocols in the sample list encrypts data by default, so nothing restricts access to the data as it travels from one system to another.

HTTP - the protocol of the web - does, however, have a solution to this problem. SSL (Secure Sockets Layer) establishes a process to incorporate cryptographic methods that identify the parties in communication and establish a secure method of data transmission over the web (HTTPS).

Note: Today SSL's successor is TLS (Transport Layer Security), but it is still commonly referred to as SSL (or more accurately SSL/TLS).

Since the initial phase of establishing an SSL/TLS connection involves intense mathematical calculations, implementation in the past was limited to specific webpages (an e-commerce site's checkout page, for example). However, today the trend is to implement it as broadly as possible.

  • Popular sites, such as Google or Facebook, will conduct all communication over HTTPS by default by redirecting the initial HTTP request to HTTPS.
  • Popular web browsers will attempt to connect to a website via HTTPS first by rewriting the initial HTTP request to HTTPS before attempting a connection.
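
The server-side redirect described in the first bullet can be sketched in a few lines of PHP. This is only an illustration under stated assumptions: the helper name httpsRedirectTarget is hypothetical, and it assumes the web server populates the usual HTTPS, HTTP_HOST and REQUEST_URI entries of $_SERVER. Sites behind a proxy or load balancer may need a different check.

```php
<?php
// Hypothetical helper (not from the original post): returns the HTTPS URL
// a plain HTTP request should be redirected to, or null when the request
// already arrived over HTTPS.
function httpsRedirectTarget(array $server): ?string
{
    // PHP sets $_SERVER['HTTPS'] to a non-empty value other than "off"
    // when the request was made over HTTPS.
    $isHttps = !empty($server['HTTPS']) && strtolower($server['HTTPS']) !== 'off';
    if ($isHttps) {
        return null;
    }

    $host = $server['HTTP_HOST'] ?? 'localhost';
    $uri  = $server['REQUEST_URI'] ?? '/';

    return 'https://' . $host . $uri;
}

// Usage in a front controller (commented out so the sketch has no side effects):
// if (($target = httpsRedirectTarget($_SERVER)) !== null) {
//     header('Location: ' . $target, true, 301);
//     exit;
// }
```

Keeping the decision in a small function that takes the server array as an argument makes it easy to check without a live web request.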

Does your website need SSL/TLS? That's a risk assessment you need to make with your web developer and hosting provider. But consider:

  • The trend is to secure more data in transit, not less.
  • Your website's visitors are not just concerned about sensitive information that they are actively providing (credit card information, for example), but also about other information they are actively and passively providing, such as what webpage they are viewing.

Our next security post will cover the second topic: data storage. In the meantime, have a question about security and the web? Post your question in the comments section below.

Cleaning House

Over on Facebook a few days ago I commented about a personal "new year" project of reorganizing (first the home office, next this website):

"Phase one of reorganizing home office desk completed. Most useless item: note to self to clean desk (near the bottom no less) Single largest source of paper: Health Insurance"
Now I think I can add the most interesting item from the excavation:

Business card collection (Photo credit: pdweinstein)

A collection of business cards from contacts and interests from a few years ago, hiding among a stash of old passwords. The Thawte, O'Reilly and daemonnews cards are from contacts I had when I did more technical writing, which started out on topics of SSL and Apache. The Google card is from a recruiter I had contact with at the time (still waiting on a job, Google ;-)

I had the pleasure of working with Eddie Codel and Scott Beale on Webzine events and even "server sat" Laughing Squid's hosting setup one Labor Day weekend while Scott and crew went to Burning Man.

Ah memories...

PHP, Nagios and MySQL Replication

Overview

MySQL replication is a handy way to distribute database processes across several servers. For example, a simple "master-slave" setup allows for a continuous backup of data from a primary database server, the master, to a secondary backup server, the slave. But what if the slave server stops replicating for some reason? It is not much of a backup if it fails to copy data for some undetermined length of time.

The good news is that MySQL provides a simple, detailed query for checking whether replication is taking place, and it will report errors should they occur. The trick, of course, is getting notified quickly when an issue does occur. Given an existing Nagios setup for service monitoring at a PHP shop, the only missing piece is some code.

The Details
First off, Nagios can supply arguments to a script just as if the script were being invoked at the command line. One common set of arguments for Nagios scripts is warning and critical thresholds. For example, a disk allocation script might take arguments to send a warning notification if the amount of free disk space reaches 20% and a critical notification if free space is 10% or less.

With MySQL replication one area of concern is the network. Any latency between the two servers can induce lag in synchronizing the slave server with the master server. Given this, why not pass our script thresholds for how many seconds the secondary server may be behind the primary?

For processing command line short form and long form options in PHP there is the getopt function:

        $shortopts  = "";
        $shortopts .= "w:"; // Required value for warning
        $shortopts .= "c:"; // Required value for critical

        $longopts  = array(
                // No long form options
        );

	// Parse our options with getopt
        $options = getopt( $shortopts, $longopts );

        // If slave is x second behind for warning state
        $delayWarn = $options['w'];

        // If slave is x second behind for a critical state
        $delayCritical = $options['c'];
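
Since getopt simply omits options that were not supplied, it may be worth validating the thresholds before using them. The following is a minimal sketch, not part of the original script: the function name parseThresholds is hypothetical, and the rule that the warning threshold must not exceed the critical one is an assumption about how the check is meant to be configured.

```php
<?php
// Hypothetical helper: validates the -w and -c values returned by getopt().
// Returns array( warning, critical ) as integers, or null when a value is
// missing, not a whole number of seconds, or the pair is inconsistent.
function parseThresholds(array $options): ?array
{
    foreach (['w', 'c'] as $key) {
        // getopt() returns option values as strings; anything else
        // (missing key, repeated option given as an array) is rejected.
        if (!isset($options[$key]) || !is_string($options[$key]) || !ctype_digit($options[$key])) {
            return null;
        }
    }

    $warn     = (int) $options['w'];
    $critical = (int) $options['c'];

    // Assumption: a warning should always fire at or before the critical state.
    if ($warn > $critical) {
        return null;
    }

    return [$warn, $critical];
}
```

A script could treat a null result as a usage error and exit with the UNKNOWN state rather than comparing against undefined thresholds.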

Besides being in a critical or warning state, Nagios also has conditions for normal and unknown. Each state is associated with a status code that will be set upon completion of the script, hence the following associative array:

        // Nagios conditions we can be in (the plugin convention uses exit code 3 for UNKNOWN)
        $statuses = array( 'OK' => '0', 'WARNING' => '1', 'CRITICAL' => '2', 'UNKNOWN' => '3' );

For the moment, we don't know what condition our replication setup is in. Nor do we have any additional information about the current state, so let's go ahead and define that as such:

        $state = 'UNKNOWN';
        $info = '';

The next step is to go ahead and connect to our slave MySQL instance and query its status using "SHOW SLAVE STATUS;"

        $db = new mysqli( $dbHost, $dbUser, $dbPasswd );

        // Prepare query statement & execute
        $query = $db->prepare( "SHOW SLAVE STATUS" );
        $query->execute();

The MySQL query is going to return a number of columns in a single result row. Of immediate concern is if the slave is in error state or not. For that we take a look at the columns labeled Slave_IO_Running, Slave_SQL_Running and Last_Errno.

        // If Slave_IO_Running OR Slave_SQL_Running are not Yes 
        // OR Last_Errno is not 0 we have a problem
        // (the variables are bound to the result columns, as in the full script below)
        if (( $SlaveIORunning != 'Yes' ) OR ( $SlaveSQLRunning != 'Yes' ) 
                OR ( $LastErrno != '0' )) {

                $state = 'CRITICAL';

If the slave server is not in error, then we'll go ahead and check how far behind it is, and set a warning or critical state given the earlier parameters from the beginning of the script:

        } else {

                // So far, so good. What about the time delay: how far behind is the slave database?
                if ( $SecondsBehindMaster >= $delayCritical ) {

                        $state = 'CRITICAL';

                } else if ( $SecondsBehindMaster >= $delayWarn ) {

                        // Must match the 'WARNING' key used in $statuses
                        $state = 'WARNING';

                } else {

                        $state = 'OK';

                }

        }
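
The same decision can also be written as a small function over an associative row, which avoids depending on the exact number and order of columns that SHOW SLAVE STATUS returns. This is a sketch under assumptions, not the approach the full script takes: the function name classifyReplication is hypothetical, and it assumes the row was fetched with mysqli's query() and fetch_assoc() so that it is keyed by the column names mentioned above.

```php
<?php
// Hypothetical helper: maps one SHOW SLAVE STATUS row (keyed by column
// name) and the two lag thresholds to a Nagios state name.
function classifyReplication(array $row, int $delayWarn, int $delayCritical): string
{
    // Both replication threads must be running and no error recorded.
    // A missing Last_Errno column is treated as an error (assumption).
    $healthy = ($row['Slave_IO_Running'] ?? '') === 'Yes'
        && ($row['Slave_SQL_Running'] ?? '') === 'Yes'
        && (int) ($row['Last_Errno'] ?? -1) === 0;

    if (!$healthy) {
        return 'CRITICAL';
    }

    // Replication is running; now check how far behind it is.
    $lag = (int) ($row['Seconds_Behind_Master'] ?? 0);

    if ($lag >= $delayCritical) {
        return 'CRITICAL';
    }
    if ($lag >= $delayWarn) {
        return 'WARNING';
    }

    return 'OK';
}

// Usage against a live server (commented out; needs a real connection):
// $result = $db->query( 'SHOW SLAVE STATUS' );
// $row    = $result ? $result->fetch_assoc() : null;
// $state  = $row ? classifyReplication( $row, $delayWarn, $delayCritical ) : 'UNKNOWN';
```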

Now that we have determined the state of the secondary database server, we can pass along some information for Nagios to process.

        // What to output?
        switch ( $state ) {

                case "UNKNOWN":
                        $info = 'Replication State: UNKNOWN';
                        break;

                case "OK":
                        $info = 'Replication State: OK Master Log File: ' .$MasterLogFile. ' Read Master Log Position: ' .$ReadMasterLogPos. ' Replication Delay (Seconds Behind Master): ' .$SecondsBehindMaster;
                        break;

                case "WARNING":
                        $info = 'Replication State: WARNING Master Log File: ' .$MasterLogFile. ' Read Master Log Position: ' .$ReadMasterLogPos. ' Replication Delay (Seconds Behind Master): ' .$SecondsBehindMaster;
                        break;

                case "CRITICAL":
                        $info = 'Replication State: CRITICAL Error: ' .$LastErrno. ': ' .$Last_Error. ' Replication Delay (Seconds Behind Master): ' .$SecondsBehindMaster;
                        break;

        }

All that is left is to transfer our information to Nagios via standard out and an exit code:

        // Need to set type to integer for exit() to handle the code properly
        $status = $statuses[$state];
        settype( $status, "integer" );

        fwrite( STDOUT, $info );
        exit( $status );
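
Because a misspelled state name would otherwise be cast to 0 and reported as OK, it can help to route the lookup through a function that falls back to UNKNOWN. A minimal sketch follows; the names NAGIOS_EXIT_CODES and nagiosExitCode are hypothetical, and the values are the exit codes of the Nagios plugin convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```php
<?php
// Exit codes per the Nagios plugin convention.
const NAGIOS_EXIT_CODES = [
    'OK'       => 0,
    'WARNING'  => 1,
    'CRITICAL' => 2,
    'UNKNOWN'  => 3,
];

// Hypothetical helper: returns the exit code for a state name and falls
// back to UNKNOWN for any name that is not recognized.
function nagiosExitCode(string $state): int
{
    return NAGIOS_EXIT_CODES[$state] ?? NAGIOS_EXIT_CODES['UNKNOWN'];
}

// Usage at the end of a check script (commented out so nothing exits here):
// fwrite( STDOUT, $info );
// exit( nagiosExitCode( $state ) );
```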

Putting it all together we get something like this:

#!/usr/bin/php
<?php

	$shortopts  = "";
	$shortopts .= "w:"; // Required value for warning
	$shortopts .= "c:"; // Required value for critical

	$longopts  = array( 
		// No long form options 
	);

	$options = getopt( $shortopts, $longopts );

	// If slave is x second behind, set state as warn
	$delayWarn = $options['w'];

	// If slave is x second behind, set state as critical
	$delayCritical = $options['c'];

	// Nagios conditions we can be in
	$statuses = array( 'OK' => '0', 'WARNING' => '1', 'CRITICAL' => '2', 'UNKNOWN' => '3' );
	$state = 'UNKNOWN';
	$info = '';
	
	$dbUser = 'user';
	$dbPasswd = 'password';
	$dbHost = 'localhost';

	$db = new mysqli( $dbHost, $dbUser, $dbPasswd );

	if ( mysqli_connect_errno() ) {
	
		// Well this isn't good
		$state = 'CRITICAL';
		$info = 'Cannot connect to db server';

	} else {

		// Prepare query statement & execute
		if ( $query = $db->prepare( "SHOW SLAVE STATUS" )) {

			$query->execute();

			// Bind our result columns to variables
			$query->bind_result( $SlaveIOState, $MasterHost, $MasterUser, $MasterPort, $ConnectRetry, $MasterLogFile, $ReadMasterLogPos, $RelayLogFile, $RelayLogPos, $RelayMasterLogFile, $SlaveIORunning, $SlaveSQLRunning, $ReplicateDoDB, $ReplicateIgnoreDB, $ReplicateDoTable, $ReplicateIgnoreTable, $ReplicateWildDoTable, $ReplicateWildIgnoreTable, $LastErrno, $Last_Error, $SkipCounter, $ExecMasterLogPos, $RelayLogSpace, $UntilCondition, $UntilLogFile, $UntilLogPos, $MasterSSLAllowed, $MasterSSLCAFile, $MasterSSLCAPath, $MasterSSLCert, $MasterSSLCipher, $MasterSSLKey, $SecondsBehindMaster, $MasterSSLVerifyServerCert, $LastIOErrno, $LastIOError, $LastSQLErrno, $LastSQLError );

			// Go fetch
			$query->fetch();

			// Done
			$query->close();

			// and done
			$db->close();
	
			// If Slave_IO_Running OR Slave_SQL_Running are not Yes OR Last_Errno is not 0 we have a problem
			if (( $SlaveIORunning != 'Yes' ) OR ( $SlaveSQLRunning != 'Yes' ) OR ( $LastErrno != '0' )) {
		
				$state = 'CRITICAL';	
		
			} else {
	
				// So far so, good, what about time delay, how behind is the slave database?
	
				if ( $SecondsBehindMaster >= $delayCritical ) {
				
					$state = 'CRITICAL';
				
				} else if ( $SecondsBehindMaster >= $delayWarn ) {
				
					$state = 'WARNING';
				
				} else {
	
					$state = 'OK';
		
				}
			
			}
	
	
		} else {
			
			// Well this isn't good
			$state = 'CRITICAL';
			$info = 'Cannot query db server';			
			
		}
	
		// What to output?
		switch ( $state ) {

			case "UNKNOWN":
				$info = 'Replication State: UNKNOWN';
				break;

			case "OK":
				$info = 'Replication State: OK Master Log File: ' .$MasterLogFile. ' Read Master Log Position: ' .$ReadMasterLogPos. ' Replication Delay (Seconds Behind Master): ' .$SecondsBehindMaster;
				break;

			case "WARNING":
				$info = 'Replication State: WARNING Master Log File: ' .$MasterLogFile. ' Read Master Log Position: ' .$ReadMasterLogPos. ' Replication Delay (Seconds Behind Master): ' .$SecondsBehindMaster;
				break;

			case "CRITICAL":
				// Keep any more specific message set earlier (connection or query failure)
				if ( $info == '' ) {
					
					$info = 'Replication State: CRITICAL Error: ' .$LastErrno. ': ' .$Last_Error. ' Replication Delay (Seconds Behind Master): ' .$SecondsBehindMaster;
			
				}
				break;
			
		}
	
	}

	// Need to set type to integer for exit to handle the exit code properly
	$status = $statuses[$state];
	settype( $status, "integer" );

	fwrite( STDOUT, $info );
	exit( $status );


?>

WWDC 2012 Predictions

Apple's Worldwide Developer Conference starts this week, which means it is time for everyone under the sun to make predictions about what will be announced in the conference's keynote tomorrow.


Macbook Update
First off, the complete given: new Macbook Airs. It also seems a given that Apple's laptop line will get an update that pushes it more in line with the trend-setting Macbook Air. In other words, we'll see the start of a consolidation where most of Apple's laptop options will be thinner, sleeker and Air-like, with one or perhaps two "Pro" options for high-end users. The open question seems to be whether the laptops will be getting the rumored "Retina Display" during this refresh or not.


OS X Update
Back in February Apple previewed the next release of OS X, v10.8 (Mountain Lion). I've already noted elsewhere I hope Mountain Lion is a nod to a previous OS X release, Snow Leopard, in that much as the Leopard release introduced a host of new concepts that later got refined in Snow Leopard, Mountain Lion will see lots of optimization of the initial iOS-izing of OS X introduced in Lion. 

Regardless, Apple promised a late summer release, so WWDC will be where we learn that it's on track, will be out the door soon and look at the cool things it does.


iOS Update
Keeping to the developer conference theme and moving from one platform to the next we'll get our first public viewing of what Apple is cooking up for iOS 6. Rumors have Facebook integration being added into the OS, similar to Apple's Twitter integration along with a move away from default apps using Google-based services such as Maps.

I don't doubt Apple is working on supporting Facebook given it's hugely popular. However, I don't see them getting too wild with it. After all, the last thing Apple wants to do is give Facebook the same kind of treatment it gave Google only to see them turn around and release their own competitive mobile platform. Which of course is why Apple is rumored to be moving away from using Google's services in default apps.

Here's a crazy and wild thought, instead of suggesting Apple purchase Twitter, I'm going to suggest Apple purchase Yahoo. Yeah sure, lots of Yahoo services suck and don't really meet Apple's high standards or business needs. But look at what you would get, a whole web and data-based services infrastructure and user base for ads, photo sharing, mapping and text/voice based searching. All things iOS users need or are dependent on.  
 
Speaking of voice-based search, when is a beta release of a new software service not a beta release? When you release it to over 4 million new users and run prime-time commercials featuring A-list celebrities. Yeah, I'm talking about Siri. I know some think Siri is over-promising and under-delivering. I suppose that's true to some extent. But it is only a "beta" release, whatever that means these days.

The real question is, what improvements will Apple be introducing? Is Siri limited to just the iPhone, or will it be making a jump to the iPad in iOS 6? Personally, I think Siri makes sense as an iPhone-only service. It not only helps differentiate the two, but also keeps Siri where it would be most helpful, in an "on the go" environment where one isn't necessarily fully engaged in the digital moment.

But don't expect any new iOS hardware. The iOS preview will be setting the stage for new iOS devices in the fall.


Not so Given
Other hardware consolidation, but on the desktop? For a long time now, the trend in personal computing has been moving away from the desktop. Most people buy laptops these days (or, much to Apple's preference, iPads and iPhones). So why does Apple need three distinct desktop models?

If most consumers are purchasing laptops, why have an all-in-one desktop system such as the iMac? Yes, the all-in-one has defined the Mac since 1984. But history is one thing Apple tries hard to keep from blinding them to the changing marketplace (floppy and optical drives, anyone?).

Sure, some people need a machine for heavy-lifting, but Apple hasn't updated the desktop Pro line in 2 years. 

So which is it, the Pro or the iMac, as the odd man out?

I personally think the iMac will fade away and, per the rumors, the Pro will be getting a much needed update after a hiatus to see if demand still existed for the device. In fact, Apple did the same thing recently with the Mac mini.


Wait and See
A proper Apple TV. Certainly Apple has been working on something. One only has to look in the Isaacson bio of Steve Jobs, wherein Jobs says of a TV device, "it will have the simplest user interface you could imagine. I finally cracked it."

For me, the problem really isn't technology. The problem is the business. Who is going to partner with Apple on this? Comcast? They have too much to lose from a service perspective, so why be forthcoming with their content (NBC/Universal)?

In the past one could count on Disney partnering with Apple because Jobs was the largest single shareholder of Disney (thanks to Disney's purchase of Pixar). But now?

So can Apple just bypass Comcast and the like? I don't know.

One thing I will predict about a TV offering from Apple is that if there is an announcement, it will be a preview of some future availability. Unlike their current devices, where pre-announcing an update can hurt sales of existing models, Apple has very little to lose with a preview of a new TV device, other than perhaps some small percentage of sales of the current "hobby" Apple TV. In fact, since Apple has no current TV model, pre-announcing actually gives them an advantage: it keeps the marketplace frozen as everyone waits to see the new product up close and in person.

So weird, Connecting HavenCo and Red Hat

It's a bit weird to be reading about Red Hat posting $1 billion in revenue in a year for the first time, or this Ars article by James Grimmelmann about HavenCo, since, to me personally, that's part of my past.

See, as Grimmelmann notes, HavenCo's chairman of the board was Sameer Parekh, whom I worked with/for at a different internet security company, C2Net Software. Almost everything Grimmelmann writes about I remember first-hand. I even remember reading the Wired articles he references (and how could I forget Neal Stephenson's Cryptonomicon, it's still one of my favorite novels).

Around the same time, Steven Levy wrote the non-fiction book Crypto, which tells part of the history of securing communications and modern computing networks; from Whitfield Diffie and the initial concerns of privacy to Netscape and the creation of SSL.

Alas, Levy's book is already 10 years old. While it covers the basis for the cryptography that powers today's Internet, it doesn't necessarily tell the whole story. Parts of the story are missing: the shortcomings of SSL and its open standard successor, TLS; the adoption of "virtual private networks", which allow the use of primarily public networks, such as the Internet, to connect remote points securely, as if part of a central private network; and the fact that much of today's email remains in "plaintext", despite the availability of encryption methods such as PGP.

Most of what happens on today's Internet every moment took root around the same time as Levy's work, 1999-2001, when I was right there working for C2Net with its own vision of how to secure everyday communications on the "Information Superhighway".

And what happened to C2Net? Well, it was sold to......Red Hat, of which I became an employee (and then an ex-employee).

So yeah, I have this odd mix of "I remember that" (HavenCo) and "oh, good for them" (Red Hat). Then I think, wow, I wasn't just a part of some pioneering companies "back in the day", but also witnessed some completely cutting-edge stuff that's only now being understood by the world at large.

So weird.

Chicago Open Data at Work

A few years ago Blagica Bottigliero started the website Gals' Guide as an online forum for young women moving out on their own and into the "big city". Recently, I've been working with her on taking the site to the next step: building a web application utilizing the growing sets of data about life in Chicago.[1]

The Gals' Guide Map App is designed to combine different datasets about the city's various neighborhoods into one, assisting one in finding the right place to live.

The web app is somewhere between alpha and beta stages: not ready for general use or even rigorously browser tested, but ready for feedback. To that end, we've started showing the app to our various networks to gather feedback as it moves towards a general, full public release.


The map and features therein have been influenced by other map mashups out there, such as the recent work done by the Chicago Tribune's News Applications team.[2]

Currently it incorporates data from the U.S. Census, the City of Chicago and Groupon. But, that's just the tip of the iceberg. There are plenty of other datasets about the city from sources such as county and state, Everyblock, Yelp, Grubhub and others.

Go, check it out and leave some feedback.



[1] This is also the next logical step for me from coding up PHP classes for the CTA's API and the City of Chicago's open data portal I started working on back in July.

[2] The team has a blog which includes a nice series of posts on their work; I recommend taking a look.

The Power of mod_proxy

The Power of mod_proxy: An Introduction to Proxy Servers and Load Balancers with Apache HTTP Server
Presentation slides from my mod_proxy presentation at ApacheCon NA 2011 earlier this month. In addition to this slideshow, the presentation can be downloaded in PPT and MP3 formats.



Adding SQL Server Support in PHP on Linux

Back in July I outlined a method for establishing an SSH tunnel between Linux and Windows machines. The goal of the connection was to give a PHP script on a front-end Linux web server access to information stored on the back-end private Windows server running SQL Server.

What I didn't mention at the time was how I enabled PHP support for Microsoft's SQL Server.

The most common deployments of PHP on Linux include support for MySQL or Postgres, depending largely on other factors such as the organization's preference, experience and requirements. Since PHP can be deployed on Windows, there is support for Microsoft's SQL Server. Such support is nontrivial to enable in PHP on Linux. It is, however, possible:

To enable SQL Server support in PHP on Linux, the PHP extension that provides said support requires the FreeTDS library to build against. FreeTDS is an open source implementation of C libraries originally marketed by Sybase and Microsoft to enable access to their database servers.

Downloading the source code, building and installing FreeTDS is straightforward:


$ wget \
ftp://ftp.ibiblio.org/pub/Linux/ALPHA/freetds/stable/freetds-stable.tgz
$ gunzip freetds-stable.tgz
$ tar xf freetds-stable.tar
$ cd freetds-*
$ ./configure --prefix=/usr/local/freetds
$ make
$ make install

The next step is to build the PHP source code against the FreeTDS libraries to include SQL Server support. This can be done one of two ways: build PHP from scratch or build the specific PHP extension. Since I was working on a server with a preexisting install of PHP, I opted for door number two:

Locate or download the source code for the preexisting version of PHP. Next, copy the mssql extension source code from the PHP source code into a separate php_mssql directory:


$ mkdir -p ~/src/php_mssql
$ cp ext/mssql/config.m4 ~/src/php_mssql
$ cp ext/mssql/php_mssql.c ~/src/php_mssql
$ cp ext/mssql/php_mssql.h ~/src/php_mssql

Now build the source code, pointing it to where FreeTDS has been installed:


$ cd ~/src/php_mssql
$ phpize
$ ./configure --with-mssql=/usr/local/freetds
$ make

There should now be a mssql.so file in ~/src/php_mssql/modules/ that can be copied into the existing PHP install. Once copied, the last remaining steps are to enable the extension by modifying the php.ini file and restarting the Apache HTTP Server.
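
After the restart, a quick way to confirm that the extension was picked up is to ask PHP itself. The following is a minimal sketch; the function name mssqlSupportStatus is hypothetical, and it assumes the extension registers under the name mssql and provides mssql_connect, as the extension built above does. Note that this extension was removed in PHP 7.0, so the check only succeeds on older PHP versions.

```php
<?php
// Hypothetical helper: reports whether the mssql extension built above
// is loaded in the running PHP.
function mssqlSupportStatus(): string
{
    if (extension_loaded('mssql') && function_exists('mssql_connect')) {
        return 'mssql extension is loaded';
    }

    return 'mssql extension is NOT loaded; check the extension line in php.ini';
}

// Usage from a test page or the command line (commented out):
// echo mssqlSupportStatus(), "\n";
```

Running the check through the web server rather than the command line matters here, since the two may read different php.ini files.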

Additional Information can be found here: Connecting PHP on Linux to MSQL on Windows

Accessing Chicago, Cook and Illinois Open Data via PHP

In Accessing the CTA's API with PHP I outlined a class file I created[1] for querying the Chicago Transit Authority's three web-based application programming interfaces (APIs). However, that isn't the only open data development I've been working on recently; I've also been working on a class file for accessing the City of Chicago's Open Data Portal.

The City of Chicago's Open Data Portal is driven by a web application developed by Socrata. Socrata's platform provides a number of web-based API methods for retrieving published datasets[2]. class.windy.php provides the definition for a PHP object that wraps around Socrata's API, in turn giving apps written in PHP access to the city's open data.


Installation and Instantiation
The first step is to download the class.windy.php file from GitHub and save it in a location that the PHP application can access.

The next step is to include the file using the include function in the PHP application itself:

	// Load our class file and create our object.
	include_once( 'class.windy.php' );

Once the class file has been loaded, the next step is to instantiate the class:
	// Create an object to get city data
	$chicago = new windy( 'city' );

Note that initialization of the new object requires providing the keyword 'city'. Since the City of Chicago, Cook County and the State of Illinois all currently use Socrata's platform for sharing government data, class.windy.php supports accessing all three data portals - more on accessing county and state data later.

Extra parameters can be provided during the construction of the object for additional customization. For example, by default, this class will request data in JSON form. But Socrata does support other data formats[3], so the PHP developer can have a bit of a choice as to how to handle and process the raw data, if desired.


Chicago Data
Let's say, for example, that one is looking for information about Chicago's roughly 200 neighborhoods. More specifically, information about the rough boundaries around Chicago neighborhoods. Using the getViews[4] method that class.windy.php provides, one can query for any dataset where the description might include the keywords 'neighborhood boundaries' and that is tagged as KML:

	// Let's find any views that describe themselves as about 
	// Chicago's neighborhood boundaries and are tagged with 
	// KML data
	$views = $chicago->getViews( 
		'', '', 
		'neighborhood boundaries', 'kml', 
		'', 'false', '', '' 
	);
	
	echo "Here are the views with a description including "
	     ."'neighborhood boundaries' and are tagged as KML:\n";
	foreach ( $views as $view ) {
	
		echo "View ID: ".$view->id. " is named " 
		     .$view->name. " and is described as a " 
		     .$view->description. "\n\n";
		
	}

Depending on what datasets are available, the output of this query may include the following result:
View ID: buma-fjbv is named Boundaries - Neighborhoods - KML and is described as a KML file of neighborhood boundaries in Chicago. To view or use these files, special GIS software, such as Google Earth, is required.

Perhaps, using Google Maps, one wishes to create an interactive map of Chicago neighborhoods? Or better yet a closer look at the boundaries of a specific area in the city? Retrieval of the KML file can be done via the getFileByViewID function:
		// With our foreknowledge of datasets the file with 
		// view id buma-fjbv looks interesting, let's get 
		// that file
		if ( $view->id == 'buma-fjbv' ) {
					
			$file = $chicago->getFileByViewID( 
				$view->blobId, $view->id 
			);

			// Since KML is an XML notation for 
			// expressing geographic annotation, 
			// let's use SimpleXML to parse this data
			// Note: SimpleXML requires the libxml 
			// PHP extension
			$xml = simplexml_load_string( $file );

The above chunk of code, nested in the previous example's foreach loop, compares the $view->id value of each dataset, looking for a match to the specific KML file of Chicago neighborhoods. Once found, the getFileByViewID method, given the file id found in $view->blobId and the known view id, queries the data portal and retrieves the file.

Once the file has been loaded, the KML file, a type of XML format, can be parsed using the simplexml_load_string function, which returns a SimpleXML object that can be traversed as needed:
			// Ok, now let's find out what the 
			// boundaries are for Albany Park
			foreach ( 
			     $xml->Document->Folder->Placemark 
				as $hood 
			) {
			
				if ( preg_match( 
					'/Albany Park/', 
					$hood->description 
				)) {
				
					echo "Here are Albany Park's boundaries: "
						.$hood->MultiGeometry->Polygon
						->outerBoundaryIs->LinearRing
						->coordinates. "\n";
				
				}
			
			}

Thus the output of our search for the boundaries of Chicago's Albany Park neighborhood results in the following geographic data points:
Here are Albany Park's boundaries:  
-87.724211,41.975689,0.000000
-87.724081,41.975576,0.000000 -87.723951,41.975578,0.000000
-87.723773,41.975580,0.000000 -87.723544,41.975583,0.000000
-87.723355,41.975585,0.000000 -87.722786,41.975592,0.000000
-87.721267,41.975605,0.000000 -87.720888,41.975608,0.000000
-87.720859,41.974088,0.000000 -87.720825,41.974085,0.000000
-87.720636,41.974088,0.000000 -87.720561,41.974079,0.000000
-87.720501,41.974061,0.000000 -87.720370,41.974010,0.000000
-87.720264,41.973981,0.000000 -87.720135,41.973962,0.000000
-87.719949,41.973890,0.000000 -87.719723,41.973819,0.000000
-87.719616,41.973775,0.000000 -87.719252,41.973625,0.000000
-87.719147,41.973578,0.000000 -87.719074,41.973535,0.000000
-87.718948,41.973461,0.000000 -87.718826,41.973396,0.000000
-87.718714,41.973346,0.000000 -87.718388,41.973226,0.000000
-87.718366,41.973218,0.000000 -87.718291,41.973195,0.000000
-87.718069,41.973097,0.000000 -87.717935,41.973023,0.000000
-87.717737,41.972915,0.000000 -87.717479,41.972781,0.000000
-87.717195,41.972637,0.000000 -87.717132,41.972604,0.000000
-87.716961,41.972517,0.000000 -87.716548,41.972280,0.000000
-87.716203,41.972086,0.000000 -87.715963,41.971978,0.000000
-87.715918,41.971968,0.000000 -87.715808,41.971941,0.000000
-87.715666,41.971921,0.000000 -87.715501,41.971920,0.000000
-87.715076,41.971932,0.000000 -87.714697,41.971930,0.000000
-87.714681,41.971930,0.000000 -87.714619,41.971930,0.000000
-87.714614,41.971930,0.000000 -87.714547,41.971937,0.000000
-87.714479,41.971942,0.000000 -87.714435,41.971949,0.000000
-87.714385,41.971952,0.000000 -87.714305,41.971954,0.000000
-87.714227,41.971960,0.000000 -87.714178,41.971965,0.000000
-87.714026,41.971969,0.000000 -87.713652,41.972033,0.000000
-87.713466,41.972078,0.000000 -87.713254,41.972129,0.000000
-87.712834,41.972277,0.000000  ...
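
The coordinates come back as one long string of whitespace-separated "longitude,latitude,altitude" tuples. A minimal sketch of splitting that string into usable points follows; parseKmlCoordinates is a hypothetical helper written for illustration, not part of the class file, and the sample string is simply the first three points from the output above:

```php
<?php
// Hypothetical helper (not part of the class file): turn the text of a KML
// <coordinates> element into a list of longitude/latitude pairs.
function parseKmlCoordinates( $coordinates ) {

	$points = array();

	// Tuples are separated by whitespace; each tuple is "lon,lat,alt"
	$tuples = preg_split( '/\s+/', trim( (string) $coordinates ), -1, PREG_SPLIT_NO_EMPTY );
	foreach ( $tuples as $tuple ) {

		$parts = explode( ',', $tuple );
		if ( count( $parts ) < 2 ) {
			continue; // Skip anything that is not a coordinate tuple
		}

		$points[] = array(
			'lon' => (float) $parts[0],
			'lat' => (float) $parts[1],
		);

	}

	return $points;

}

// The first three points from the output above
$sample = "-87.724211,41.975689,0.000000
-87.724081,41.975576,0.000000 -87.723951,41.975578,0.000000";

$points = parseKmlCoordinates( $sample );
echo count( $points ) . " points, first at " . $points[0]['lat'] . " North\n";
```

The cast to string matters when the value is passed straight from the SimpleXML traversal, since that value is an object rather than plain text.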


Cook and Illinois Data Portals

As previously mentioned, Cook County and the State of Illinois also currently use Socrata's platform for sharing government data. For the moment, that makes supporting city, county and state open data with one class file a trivial matter:

// Create an object to get county data
$cook = new windy( 'county' );

// Let's find any views that describe the boundaries of the 
// county forest preserves
$views = $cook->getViews( 
	'', '', '', 'county park boundaries', '', 'false', '', '' 
);
	
echo "Here are the views with a description including "
	."'county park boundaries':\n";
foreach ( $views as $view ) {
	
	echo "View ID: ".$view->id. " is named " 
		.$view->name. " and is described as a " 
		.$view->description. "\n\n";
	
}


In Review
class.windy.php is a single PHP class file that primarily provides access to the City of Chicago's data portal. The class implements functions for accessing all API methods and by default returns an object that a PHP developer can use to incorporate information about the City of Chicago into their PHP-based application.

Since Cook County and the State of Illinois have also adopted Socrata's Open Data Platform, support for these additional data portals has been included as a secondary feature of the class file.




1 And which can be found on GitHub

2 Known as SODA, a standards-based, RESTful application programming interface.

3 Such as XML, RDF, XLS and XLSX (Excel), CSV, TXT and PDF.

4 getViews is one of a collection of methods Socrata refers to as ViewsService. This collection of API calls provides for the retrieval and manipulation of datasets and metadata about datasets.

Accessing the CTA's API with PHP


Overview
Last month the City of Chicago arranged an Open Data Hackathon in which a collection of programmers gathered to develop and write programs that utilize a new resource: open access to city information.

For my part, I spent the day writing a PHP class file that wraps around the Chicago Transit Authority's web-based application programming interface, enabling access to CTA bus, rail and service information for PHP-driven applications. As I've noted in the README file, "this class brings all three APIs together into one object with related methods."

The following is a quick rundown of how to incorporate this new class file into a working PHP application.


Installation
The first step is to download the class.cta.php file from GitHub and save it in a location to which the PHP application has read access.

The next step is to include the file using the include (or similar require) function in the PHP application itself:

// Load the class file in our current directory
include_once( 'class.cta.php' );

Once the class file has been loaded, the next step is to instantiate the class:

$transit = new CTA (
	'YOUR-TRAIN-API-KEY-HERE',
	'YOUR-BUS-API-KEY-HERE', false
);

Notice that instantiating $transit involves providing two API keys, which can be requested from the CTA. For a Train Tracker API key, use the Train Tracker API Application form. For Bus Tracker, first sign into Bus Tracker, then request a Developer Key under "My Account".1

If no valid API keys are provided, the only methods that will return valid information are the Customer Alert functions for system status information, specifically the two functions statusRoutes and statusAlerts. This is because the Customer Alert API does not require an API key for access.


Execution
To invoke a method simply use the object and related function, providing any additional information as parameters, if required. For example, to get information about all of the bus stops the east-bound route 81 bus makes:

// Get an array result of all stops for an east-bound 81 bus.
$EastBoundStops = $transit->busGetStops( '81', 'East Bound' );

All methods return an array which can be accessed to retrieve desired information. PHP's print_r or var_dump functions provide insight into all information returned by a specific function:

echo '<pre>';
print_r( $transit->busGetStops( '81', 'East Bound' ));
echo '</pre>';

The output will look something akin to this:

SimpleXMLElement Object
(
    [stop] => Array
        (
            [0] => SimpleXMLElement Object
                (
                    [stpid] => 3751
                    [stpnm] => 2900 W Lawrence
                    [lat] => 41.968500785328
                    [lon] => -87.701137661934
                )
...
            [49] => SimpleXMLElement Object
                (
                    [stpid] => 3725
                    [stpnm] => Milwaukee & Higgins
                    [lat] => 41.969027266773
                    [lon] => -87.761798501015
                )

        )

)

Suppose the goal is to generate the following output, listing the location of the Lawrence & Kimball stop:

Lawrence & Kimball (Brown Line)
At 41.968405060961 North and -87.713229060173 West

The following PHP code will provide the latitude and longitude of the Kimball stop, which is also a transfer point to the El's Brown Line:

$EastBoundStops = $transit->busGetStops( '81', 'East Bound' );
foreach( $EastBoundStops as $stop ) {

     if ( preg_match( '/kimball/i', $stop->stpnm )) {
		
          // Print the stop name and its position on separate lines
          echo $stop->stpnm. "\n";
          echo 'At ' .$stop->lat. ' North and ' .$stop->lon. " West\n";
		
     }
	
}

Notice that while the list of stops is provided in an array, each element in the array is a SimpleXMLElement object, thus the use of the object syntax for accessing each element.
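
To make the object syntax concrete without an API key, here is a minimal sketch that builds the same shape from a hand-written XML string. The root element name is an assumption made for illustration, and the values are copied from the print_r output shown earlier; only PHP's built-in simplexml_load_string is used:

```php
<?php
// Hand-written XML that mimics the shape of the print_r output shown earlier.
// The <response> root element name is an assumption for illustration only.
$xml = <<<XML
<response>
	<stop>
		<stpid>3751</stpid>
		<stpnm>2900 W Lawrence</stpnm>
		<lat>41.968500785328</lat>
		<lon>-87.701137661934</lon>
	</stop>
	<stop>
		<stpid>3725</stpid>
		<stpnm>Milwaukee &amp; Higgins</stpnm>
		<lat>41.969027266773</lat>
		<lon>-87.761798501015</lon>
	</stop>
</response>
XML;

$stops = simplexml_load_string( $xml );

$names = array();
foreach ( $stops->stop as $stop ) {

	// $stop->stpnm is itself a SimpleXMLElement; cast it to get a plain string
	$names[] = (string) $stop->stpnm;

}

echo implode( "\n", $names ) . "\n";
```

Casting to string (or float, for lat and lon) is worth doing before storing or comparing values, because each property is an object rather than a scalar.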

The train function provides rail information, for example when the next Brown Line train will be leaving the Kimball stop. However, while the previous example included a stop id for the route 81 bus at Kimball, that stop id is unique to the route 81 bus and does not translate to the stop id of the Brown Line El at Kimball. Therefore, the first step is to locate the relevant GTFS2 data for the Kimball station:

/*	Per the CTA's website, El routes are identified as follows:

	Red = Red Line
	Blue = Blue Line
	Brn = Brown Line
	G = Green Line
	Org = Orange Line
	P = Purple Line
	Pink = Pink Line
	Y = Yellow Line
		
	Which means our Brown line is 'brn' 
*/
$brownStops = $transit->train( '', '', '', 'brn' );
foreach( $brownStops as $stop ) {

	if ( preg_match( '/kimball/i', $stop->staNm )) {
		
		echo "$stop->staNm train is destined for $stop->stpDe\n";
		echo "Scheduled to arrive/depart at $stop->arrT\n\n";
		
	}
	
}

Which provides output similar to the following:

Kimball train is destined for Service toward Loop
Scheduled to arrive/depart at 20110821 17:17:01

One should note that the Brown line stop at Kimball is the northern end point for the Brown line, which means any trains leaving the station will only be bound in one direction, south, toward the Loop. If the string comparison is changed to 'irving' for the Irving Park station, the output changes to something similar, with trains running in both directions:

Irving Park train is destined for Service toward Kimball
Scheduled to arrive/depart at 20110821 17:19:34

Irving Park train is destined for Service toward Kimball
Scheduled to arrive/depart at 20110821 17:23:13

Irving Park train is destined for Service toward Loop 
Scheduled to arrive/depart at 20110821 17:19:44

Irving Park train is destined for Service toward Kimball 
Scheduled to arrive/depart at 20110821 17:32:19 
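
The arrival times above are plain strings in a "yyyymmdd hh:mm:ss" layout. A minimal sketch of turning them into DateTime objects follows, so they can be compared or reformatted. parseArrivalTime is a hypothetical helper, not part of class.cta.php, and treating the values as Chicago local time is an assumption:

```php
<?php
// Hypothetical helper (not part of class.cta.php): convert an arrT value
// such as "20110821 17:17:01" into a DateTime object.
// Assumption: the times are Chicago local time.
function parseArrivalTime( $arrT ) {

	$when = DateTime::createFromFormat(
		'Ymd H:i:s',
		trim( $arrT ),
		new DateTimeZone( 'America/Chicago' )
	);

	// createFromFormat returns false when the string does not match
	return ( $when === false ) ? null : $when;

}

// Two of the Kimball-bound arrival times from the output above
$first  = parseArrivalTime( '20110821 17:19:34' );
$second = parseArrivalTime( '20110821 17:23:13' );

// Whole seconds between the two trains
$gap = $second->getTimestamp() - $first->getTimestamp();

echo $first->format( 'g:i a' ) . ", then another " . floor( $gap / 60 ) . " min later\n";
```

Returning null for an unparseable string keeps the caller's check simple; an application could just as reasonably throw an exception instead.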

In Review
class.cta.php is a single PHP class file that provides access to all three CTA APIs for Bus, Train and Service information. The class implements functions for access to all API methods and returns an array of SimpleXMLElement objects that a PHP developer can use to incorporate real-time information about Chicago's public transit system.

Additional information about the CTA's APIs, including terms of use and how to request API Keys, can be found on the CTA's Developer page.




1 Why two different API Keys, one for train and one for bus information? Due to the evolution of the CTA's API interfaces, there are three distinct APIs, one each for Bus, Train and Customer Alerts information. As a result there are three distinct URI endpoints and two distinct API keys.

2 The CTA provides its data based on the Google Transit Feed Specification (GTFS), which is becoming a common format for public transportation agencies to publish schedules and associated geographic information. The CTA generates and distributes, about once a week, an up-to-date zip-compressed collection of files that includes basic agency information, route, transfer and stop locations, and other related service information. Note that ids 0-29999 are bus stops, ids 30000-39999 are train stops and ids 40000-49999 are train stations (parent stops).
