[Chicago] Address parser?

bray at sent.com bray at sent.com
Sat Feb 23 20:47:46 CET 2008


On Fri, 22 Feb 2008 11:00:29 -0600, "Michael Tobis" <mtobis at gmail.com>
said:
> This depends on whether you want to make a best effort to parse every
> address or just the ones already in a known format.
> 
> The every address case is very hard. I know someone who made a living
> for a few years doing this for Canada, which is even harder. He
> pointed out that he had a special case in his code for Avenue Road, a
> major street in Toronto.
> 

I am not sure if this will help in Lucas's case, but in general you
can get (if you pay some money) fixed-width data files from
USPS/AIS <http://www.usps.com/ncsc/addressinfo/addressinfomenu.htm> that
contain the correctly formatted address data for every US address. You
can do a fuzzy string match, combined with some predetermined rules,
against that data and then use the address from the big database. It's
fairly easy to do with Python. OTOH, there are some commercial software
packages that do this for you (just look for CASS-certified software).
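
To give a rough idea of what I mean by rules plus a fuzzy match, here is
a minimal sketch using only the standard library. The RULES table, the
REFERENCE list, and the function names are all made up for illustration;
REFERENCE just stands in for records you would load from the purchased
data files, whose actual layout I am not showing here.

```python
import difflib
import re

# Hypothetical abbreviation rules; a real table would be much larger.
RULES = {
    "north": "N", "south": "S", "east": "E", "west": "W",
    "street": "ST", "avenue": "AVE", "road": "RD",
    "drive": "DR", "boulevard": "BLVD",
}

# Stand-in sample rows; real data would come from the reference files.
REFERENCE = [
    "1060 W ADDISON ST",
    "233 S WACKER DR",
    "1901 W MADISON ST",
]


def normalize(address):
    """Strip punctuation, apply the abbreviation rules, and uppercase."""
    tokens = re.sub(r"[^\w\s]", " ", address).lower().split()
    return " ".join(RULES.get(tok, tok).upper() for tok in tokens)


def best_match(raw, reference, cutoff=0.8):
    """Return the closest reference address, or None if nothing is close."""
    by_key = {normalize(ref): ref for ref in reference}
    hits = difflib.get_close_matches(normalize(raw), list(by_key),
                                     n=1, cutoff=cutoff)
    return by_key[hits[0]] if hits else None


if __name__ == "__main__":
    # The misspelled street name still resolves to the reference row.
    print(best_match("1060 West Addisson Street", REFERENCE))
```

The rules here are applied word by word, which is naive: it will happily
abbreviate both words of a street like Avenue Road. The cutoff of 0.8 is
also just a guess; you would want to tune it against your own data.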

I noticed that if your address data makes it successfully through CASS
before you try to geocode it, you will get better results. This is
most useful when you're trying to get closer accuracy. Otherwise, it's
hard to know what to do when your geocoding API does not like the
address at all.

Regards,

Brian Ray






