[Chicago] I'm lonely presenter

Nick Bennett nick at goggl.es
Tue Jan 7 04:52:13 CET 2014


@Gang and all,

Divvy Bikes is the bike rental service that just set up shop here in
Chicago sometime in the last 6 months. They installed 300 bike stations
throughout the city, and plan to install 175 more.

There is a JSON data feed on the bike stations here:
http://divvybikes.com/stations/json

If you want to view this nicely formatted, try here:
http://jsonviewer.stack.hu/#http://divvybikes.com/stations/json

This data includes latitude, longitude, station name (it's usually
descriptive of where it is), total number of bikes, how many are currently
available, and some other stuff.
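
For anyone who wants to poke at the feed from Python, here is a minimal
sketch of pulling those fields out. It parses an inline sample rather than
hitting the network, and every key name other than 'latitude' and
'longitude' (including the 'stationBeanList' wrapper) is an assumption about
the feed's layout, so check it against the real JSON first.

```python
import json

# Hypothetical sample shaped like a station feed. The station names,
# coordinates, and all key names except 'latitude'/'longitude' are
# made up for illustration.
SAMPLE_FEED = """
{"stationBeanList": [
  {"id": 1, "stationName": "Example St & Sample Ave",
   "latitude": 41.88, "longitude": -87.63,
   "totalDocks": 15, "availableBikes": 7},
  {"id": 2, "stationName": "Placeholder Blvd & Demo Dr",
   "latitude": 41.89, "longitude": -87.64,
   "totalDocks": 19, "availableBikes": 3}
]}
"""


def load_stations(feed_text, list_key="stationBeanList"):
    """Return the list of station dictionaries from the feed text."""
    data = json.loads(feed_text)
    return data[list_key]


stations = load_stations(SAMPLE_FEED)
for station in stations:
    print(station["stationName"], station["latitude"], station["longitude"])
```

With the real feed, the text passed to load_stations would be the body of
the response from the URL above.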

If you are a Divvy Bikes member, you can go to your online account and see
a table of the stations where you picked up and dropped off bikes, along
with the duration of the ride. Presumably this is for accounting purposes.
Alex Soble created a Chrome browser extension that scrapes this table of
data from the page and enriches it with the distance between the stations.
While you more than likely did not ride straight from one station to
another, this at least gives a lower bound ("You rode more than ___
miles!"). He then takes this distance and slices and dices it with the time
and displays it very usefully with Highcharts. He also gives you a link to
download your data as CSV so you can do what you want with it.
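
As a side note on the lower bound idea: a straight-line (great-circle)
distance is an even more conservative lower bound than a routed distance,
and it needs no API call at all. The sketch below uses the haversine
formula. This is not what the extension does (it asks Google Maps for the
distance); it is only an illustration, and the coordinates are made up.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius, an approximation


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two latitude/longitude points."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


# Two hypothetical stations 0.01 degrees of latitude apart.
distance = haversine_miles(41.88, -87.63, 41.89, -87.63)
print("You rode more than {:.2f} miles!".format(distance))
```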

Take a peek at his extension here, nice charts:
https://chrome.google.com/webstore/detail/divvybrags/obpfmeilmeicjimgkpekfgmaoelbbfpf

His Chrome browser extension is open source and hosted on Github:
https://github.com/alexsoble/divvybrags2

He was featured in an article about a previous related project:
http://www.dnainfo.com/chicago/20130926/south-loop/divvy-shuts-down-inventors-mileage-tracking-app

Alex found his table of pick-ups and drop-offs was over 200 rows long, and
for each row his JavaScript code would make a Google Maps API call to get
the distance. His extension also doesn't cache your data between visits to
your profile, so every time he reloaded the page while developing the
extension he had to wait for it to slog through 200+ Google Maps API calls,
easily 3-5 minutes.

What his extension needed was a listing of distances for every possible
pairwise combination of bike stations. With 300 stations, that's 300*299/2
or 44850 Google Maps API calls. But wait, the distance from station A to
station B by bike might be different from station B to A based on bike
routing information! That doubles the count to 300*299, and 89700 API calls
is a rather daunting figure when faced with the API usage limits for free
accounts (
https://developers.google.com/maps/documentation/distancematrix/#Limits).
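
The counting can be double-checked in a few lines. This is only a sketch of
the arithmetic: unordered pairs are n*(n-1)/2, and ordered pairs (A to B
counted separately from B to A) are n*(n-1). The function name and the toy
station list are made up for illustration.

```python
from itertools import combinations, permutations


def count_pairs(n):
    """Return (unordered, ordered) counts of distinct station pairs."""
    unordered = n * (n - 1) // 2
    ordered = n * (n - 1)
    return unordered, ordered


# Sanity check against brute-force enumeration on a tiny example.
toy_stations = ["A", "B", "C", "D"]
assert len(list(combinations(toy_stations, 2))) == count_pairs(4)[0]
assert len(list(permutations(toy_stations, 2))) == count_pairs(4)[1]

unordered_300, ordered_300 = count_pairs(300)
print(unordered_300, ordered_300)
```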

The problem became solvable once a few things happened:

- Alex got a Google Maps API BUSINESS key from the Smart Chicago
Collaborative (http://www.smartchicagocollaborative.org/), which allowed us
to use the much more generous rate limits of the Business account

- I found that MapQuest has an open data version with a corresponding API,
and they have very relaxed rate limits for anyone who goes to the simple
trouble of creating a developer account.
http://open.mapquestapi.com/directions/#matrix

- I was using the distanceMatrix call inefficiently. I was originally
sending it pairs of stations, pair by pair, which goes to show I didn't
read the name of the function. Instead I sent it a 1xN matrix (a list) of
stations and it returned N distances; that's the distance between the first
station in the list and every station in the list including itself. By
packing in as many stations as the respective API allowed (24ish for the
Google distanceMatrix GET request (really weird that it's GET), 100 for the
MapQuest POST request), I significantly reduced the number of API calls I
had to make.
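
To give a feel for the savings, here is a rough sketch of the batching idea,
not the code from the repo. It assumes the origin counts toward the
per-request location limit and that the origin is left out of its own
destination list; both are assumptions, and the limits (24 and 100) are just
the figures mentioned above.

```python
import math


def chunked(items, size):
    """Yield consecutive slices of items, each at most size long."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def one_to_many_batches(stations, max_locations):
    """Yield (origin, destinations) pairs, one per API request.

    Each request carries one origin plus up to max_locations - 1
    destinations (assumption: the origin counts toward the limit).
    """
    per_call = max_locations - 1
    for i, origin in enumerate(stations):
        others = stations[:i] + stations[i + 1:]
        for destinations in chunked(others, per_call):
            yield origin, destinations


def calls_needed(n, max_locations):
    """Number of requests to cover all ordered pairs of n stations."""
    return n * math.ceil((n - 1) / (max_locations - 1))


print(calls_needed(300, 24))   # limit of 24 locations per request
print(calls_needed(300, 100))  # limit of 100 locations per request
```

Under those assumptions the request count drops from tens of thousands to a
few thousand or fewer.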

I wrapped the distanceMatrix calls for both services in functions that take
a list of stations as dictionaries that have at least 'latitude' and
'longitude' keys. They just rely on the third-party library requests (
http://docs.python-requests.org). I endeavoured to make them as readable as
I could because I imagine myself having to read the code again in 6 months:

https://github.com/tothebeat/pairwise-geo-distances/blob/master/mapquest.py
https://github.com/tothebeat/pairwise-geo-distances/blob/master/googlemaps.py
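
For a rough idea of what such a wrapper has to do before it calls the
service, here is a sketch (not the code in the modules above) that turns a
list of station dictionaries into query parameters for a one-to-many
request. The parameter names 'origins', 'destinations' and 'mode', and the
pipe-separated "lat,lng" format, are my understanding of the Google
Distance Matrix web service; treat them as assumptions and check the
documentation. The function names are hypothetical. Unlike the approach
described above, this sketch leaves the first station out of its own
destination list.

```python
def format_latlng(station):
    """Format a station dictionary as a 'latitude,longitude' string."""
    return "{},{}".format(station["latitude"], station["longitude"])


def build_one_to_many_params(stations, mode="bicycling"):
    """Build query parameters: first station is the origin, rest are destinations.

    Only the 'latitude' and 'longitude' keys of each station are used.
    """
    if len(stations) < 2:
        raise ValueError("need an origin and at least one destination")
    origin, destinations = stations[0], stations[1:]
    return {
        "origins": format_latlng(origin),
        "destinations": "|".join(format_latlng(s) for s in destinations),
        "mode": mode,
    }


# Made-up coordinates for illustration.
example_stations = [
    {"latitude": 41.88, "longitude": -87.63},
    {"latitude": 41.89, "longitude": -87.64},
    {"latitude": 41.91, "longitude": -87.65},
]
params = build_one_to_many_params(example_stations)
print(params)
```

The resulting dictionary could then be handed to requests.get(url,
params=params), with whatever endpoint and key the service documents.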

Those are just modules for the main script, which is meant to be run from
the command line:

https://github.com/tothebeat/pairwise-geo-distances/blob/master/station_distances.py

There's similar data for NYC and D.C., which Steve Vance told me about, here:

https://github.com/tothebeat/pairwise-geo-distances/tree/master/bike_stations_data

I hope that helps!



On Mon, Jan 6, 2014 at 1:23 PM, Gang Huang <doc.n.try at gmail.com> wrote:

> @nick
> That's awesome for the divvy work. I was actually trying to implement json
> from divvy to look at path prediction for fun, which needs accurate
> distance calculation, before my free time disappeared. I'd love to hear
> how you are approaching it.
> On Jan 5, 2014 5:48 PM, "Nick Bennett" <nick at goggl.es> wrote:
>
>> I'd love an excuse to get up and talk, but I just don't know what
>> would be interesting to an audience. I'll just list a few of the things I've
>> been fiddling with; someone speak up if any of this would be interesting to
>> hear about for a few minutes.
>>
>> Recently I've been distracted from ChiPy by going to the Open Gov Hack
>> Nights (http://opengovhacknight.org/) where I've been able to help Alex
>> Soble with his Divvy Bikes-related Chrome extension Divvy Brags (
>> https://chrome.google.com/webstore/detail/divvybrags/obpfmeilmeicjimgkpekfgmaoelbbfpf)
>> by writing a Python script to get the pairwise distances between Divvy bike
>> stations from the Google Maps API using "by the bike" distances. The first
>> script I wrote actually used the MapQuest Open API:
>> https://gist.github.com/tothebeat/7783079 I complicated that script into
>> a little project to get pairwise distances from Google Maps or MapQuest
>> using the distanceMatrix request correctly:
>> https://github.com/tothebeat/pairwise-geo-distances
>>
>> I also dove into some Roadway Fatalities data, after seeing one too many
>> of those "there have been 900 deaths on the highway this year" from the
>> NHTSA's FARS (enough initialisms?), which is available in DBF and SAS file
>> formats. I used a DBF-reading Python module that is never going to be on
>> the top 100 of PyPI downloads, and translated all roadway fatality data
>> provided for the years 1975 to 2012 into CSV format. All of this is on
>> GitHub and there's a little automatic page for it:
>> http://tothebeat.github.io/fatal-car-crashes/
>>
>> I contributed to Open Gov Hack Night in a small way by fixing two bugs on
>> the issues list of civic-json-worker (
>> https://github.com/open-city/civic-json-worker), the project that hits
>> the Github API and produces a JSON file that is then served up to power the
>> Open Gov Hack Night's project listing page (
>> http://opengovhacknight.org/projects.html). I never knew contributing to
>> a project could be so simple. It was just a few lines of code in total;
>> there was more work involved in forking, formatting my git commit
>> message nicely, and sending the pull request.
>>
>> I connected with a few other people at the hack night who wanted to
>> take a stab at scraping one of the City of Chicago's department
>> websites, namely their Business Solicitations search page (
>> https://webapps1.cityofchicago.org/VCSearchWeb/org/cityofchicago/vcsearch/controller/solicitations/begin.do?agencyId=city).
>> One person had no prior programming experience, and another was familiar
>> but had not used Python extensively. We have a Github organization and repo
>> that is quietly developing bitrot:
>> https://github.com/bnjy-opengov/chi-solicitations-feed
>>
>> That's about as far into my project stash as I can go before I start to
>> churn up real inanity.
>>
>>
>> On Sun, Jan 5, 2014 at 5:24 PM, Brian Ray <brianhray at gmail.com> wrote:
>>
>>> I am not the only one getting up there Thursday. Come on, folks, let's
>>> hear some topic proposals. What ya'all workin on huh? If I can do it, so
>>> can you ;)
>>>
>>>
>>>
>>> --
>>> Brian Ray
>>> @brianray
>>> (773) 669-7717
>>>
>>> _______________________________________________
>>> Chicago mailing list
>>> Chicago at python.org
>>> https://mail.python.org/mailman/listinfo/chicago
>>>
>>>
>>