matching a street address with regular expressions

John Machin sjmachin at lexicon.net
Fri Oct 12 15:19:18 CEST 2007


On Oct 12, 4:07 pm, Paul McGuire <pt... at austin.rr.com> wrote:
> On Oct 11, 11:50 pm, John Nagle <na... at animats.com> wrote:
>
>
>
> >     If anyone has a first-rate address parser in Python that will cover
> > most of the developed world, I'd like to talk to them.
>
> >                                 John Nagle
> >                                 SiteTruth
>
> The pyparsing examples page includes a street address parser
> (http://pyparsing.wikispaces.com/space/showimage/streetAddressParser.py)
> that will handle these test cases:
>
>     100 South Street
>     123 Main
>     221B Baker Street
>     10 Downing St
>     1600 Pennsylvania Ave
>     33 1/2 W 42nd St.
>     454 N 38 1/2
>     21A Deer Run Drive
>     256K Memory Lane
>     12-1/2 Lincoln
>     23N W Loop South
>     23 N W Loop South
>     25 Main St
>     2500 14th St
>     12 Bennet Pkwy
>     Pearl St
>     Bennet Rd and Main St
>     19th St
>
> -- Paul

"... most of the developed world" was the [very optimistic] request.
How does it go with "JAPAN 112-0001 TOKYO Bunkyo-Ku Hakusan 4-Chome 3-2",
and will it give the same result for "4-3-2 HAKUSAN BUNKYO-KU TOKYO
112-0001 JAPAN"? OK, a little exotic ... closer to "home", what about
addresses in Quebec? People often write addresses in formats that you
won't find on the postal service website, but that the local postal
workers will still deliver to. Rural addresses can be quaintly medieval,
e.g. "Lot 123, Hundred of Foughbarre" [South Australia]. Etc etc ...
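
To make the point concrete, here is a minimal sketch of a deliberately
naive "number, then street name" pattern. It is NOT the grammar used by
the pyparsing example; the names NAIVE_STREET and parse_naive are
hypothetical and exist only for this illustration. It accepts simple
US/UK-style addresses from the list above and rejects both orderings of
the Tokyo address as well as the South Australian rural one.

```python
import re

# Hypothetical, deliberately naive pattern (an assumption for illustration,
# not the pyparsing example's grammar): a house number, an optional single
# unit letter, whitespace, then a street name starting with a letter.
NAIVE_STREET = re.compile(
    r"^(?P<number>\d+)(?P<unit>[A-Z])?\s+(?P<street>[A-Za-z].*)$"
)


def parse_naive(address):
    """Return a dict of the matched parts, or None if the pattern fails."""
    m = NAIVE_STREET.match(address.strip())
    return m.groupdict() if m else None


samples = [
    "100 South Street",
    "221B Baker Street",
    "JAPAN 112-0001 TOKYO Bunkyo-Ku Hakusan 4-Chome 3-2",
    "4-3-2 HAKUSAN BUNKYO-KU TOKYO 112-0001 JAPAN",
    "Lot 123, Hundred of Foughbarre",
]

for s in samples:
    print(repr(s), "->", parse_naive(s))
```

The first two samples parse; the last three come back as None. A parser
meant for "most of the developed world" would need per-country rules,
and would have to normalise the two Tokyo forms to the same result.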



