Stuck on a three-word street name regex

Brian D briandenzer at gmail.com
Thu Jan 28 01:28:35 CET 2010


I've tackled this kind of problem before by looping through a
dictionary of patterns, but there must be a smarter approach.

Here are two addresses. Note that the first has the direction and
street name transposed. The second has an extra space before the
street type. Clearly done by someone who didn't know how to
concatenate properly -- or didn't care.

1000 RAMPART S ST

100 JOHN CHURCHILL CHASE  ST

I want to parse the elements into an array of values that can be
inserted into new database fields.

Would anyone who loves solving these kinds of puzzles care to relieve
my frazzled brain?

The pattern I'm using doesn't keep the "CHASE" with the "JOHN
CHURCHILL":

>>> import re
>>> p = re.compile(r'(?P<streetnum>\d+)\s(?P<streetname>[A-Z\s]*)\s(?P<streetdir>\w*)\s(?P<streettype>\w{2})$')
>>> s = '1405 RAMPART S ST'
>>> m = re.search(p, s)
>>> m
<_sre.SRE_Match object at 0x011A4440>
>>> print m.groups()
('1405', 'RAMPART', 'S', 'ST')
>>> s = '45 JOHN CHURCHILL CHASE ST'
>>> m = re.search(p, s)
>>> m
<_sre.SRE_Match object at 0x011A43E8>
>>> print m.groups()
('45', 'JOHN CHURCHILL', 'CHASE', 'ST')
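
The pattern above requires two whitespace separators before the street
type, so with single spaces the last word before the type always lands
in streetdir. One possible way around that, as a minimal sketch rather
than a definitive answer: make the direction group optional, restrict
it to compass abbreviations, and match the street name lazily. The
names DIRECTIONS, ADDRESS_RE and parse_address are made up for
illustration, and the sketch assumes upper-case input, a direction
that follows the street name (as in the first sample), and a street
type of two to four letters.

```python
import re

# Assumption: only compass abbreviations count as a direction.
DIRECTIONS = r'NE|NW|SE|SW|[NSEW]'

ADDRESS_RE = re.compile(
    r'^(?P<streetnum>\d+)\s+'
    # Lazy repetition: take as few words as possible for the name,
    # so the optional direction and the type get first claim on the tail.
    r'(?P<streetname>[A-Z]+(?:\s+[A-Z]+)*?)'
    # The direction is optional; if absent, the group is None.
    r'(?:\s+(?P<streetdir>' + DIRECTIONS + r'))?'
    # \s+ tolerates the stray extra space before the street type.
    # Assumption: street types are 2 to 4 letters (ST, AVE, BLVD, ...).
    r'\s+(?P<streettype>[A-Z]{2,4})$'
)


def parse_address(address):
    """Return a dict of address parts, or None if nothing matches."""
    m = ADDRESS_RE.match(address.strip())
    return m.groupdict() if m else None


first = parse_address('1405 RAMPART S ST')
second = parse_address('100 JOHN CHURCHILL CHASE  ST')
```

Here first keeps 'RAMPART' as the name with 'S' as the direction, while
second keeps 'JOHN CHURCHILL CHASE' together and leaves streetdir as
None. The dict keys match the group names, so they can be mapped onto
the new database fields directly.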


