Stuck on a three word street name regex

MRAB python at mrabarnett.plus.com
Wed Jan 27 20:27:42 EST 2010


Brian D wrote:
> I've tackled this kind of problem before by looping through a patterns
> dictionary, but there must be a smarter approach.
> 
> Two addresses. Note that the first has incorrectly transposed the
> direction and street name. The second has an extra space in it before
> the street type. Clearly done by someone who didn't know how to
> concatenate properly -- or didn't care.
> 
> 1000 RAMPART S ST
> 
> 100 JOHN CHURCHILL CHASE  ST
> 
> I want to parse the elements into an array of values that can be
> inserted into new database fields.
> 
> Anyone who loves solving these kinds of puzzles care to relieve my
> frazzled brain?
> 
> The pattern I'm using doesn't keep the "CHASE" with the "JOHN
> CHURCHILL":
> 
[snip]
A regex doesn't gain you much here. I'd split the string on whitespace
and then fix up the parts as necessary:

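>>> def parse_address(address):
...     # split() with no argument also collapses the doubled space
...     parts = address.split()
...     # Move a direction that was typed after the street name to the front
...     if parts[-2] == "S":
...         parts[1 : -1] = [parts[-2]] + parts[1 : -2]
...     # Merge everything between the number and the street type
...     parts[1 : -1] = [" ".join(parts[1 : -1])]
...     return parts
...
>>> print(parse_address("1000 RAMPART S ST"))
['1000', 'S RAMPART', 'ST']
>>> print(parse_address("100 JOHN CHURCHILL CHASE  ST"))
['100', 'JOHN CHURCHILL CHASE', 'ST']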
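
If you want the direction in its own database field, the same
split-and-fix idea can be stretched a little. What follows is only a
sketch, not something from the original thread: the DIRECTIONS set, the
name parse_address_fields and the [number, direction, name, type] layout
are all made up for illustration, and it assumes every address starts
with a house number and ends with a street type.

```python
# Hypothetical set of one-letter directions; extend it if your data
# also contains values such as "NE" or "SW".
DIRECTIONS = {"N", "S", "E", "W"}


def parse_address_fields(address):
    """Return [number, direction, street name, street type]."""
    # split() with no argument collapses runs of whitespace, so the
    # doubled space before the street type is harmless.
    parts = address.split()
    if len(parts) < 3:
        raise ValueError("expected number, name and type: %r" % (address,))

    number, middle, street_type = parts[0], parts[1:-1], parts[-1]

    # Only treat a token as a direction when something is left over to
    # serve as the street name (so "5 S ST" keeps "S" as the name).
    direction = ""
    if len(middle) > 1 and middle[-1] in DIRECTIONS:
        direction = middle.pop()      # transposed: "RAMPART S"
    elif len(middle) > 1 and middle[0] in DIRECTIONS:
        direction = middle.pop(0)     # already in front: "S RAMPART"

    return [number, direction, " ".join(middle), street_type]
```

With the two addresses from the question this should give
['1000', 'S', 'RAMPART', 'ST'] and
['100', '', 'JOHN CHURCHILL CHASE', 'ST']. A street whose last word
really is a single letter such as "E" would be misread as a direction,
so check it against your data before trusting it.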


