Stuck on a three word street name regex
briandenzer at gmail.com
Thu Jan 28 14:40:09 CET 2010
> > [snip]
> > Regex doesn't gain you much. I'd split the string and then fix the parts
> > as necessary:
> > >>> def parse_address(address):
> > ...     parts = address.split()
> > ...     if parts[-2] == "S":
> > ...         parts[1 : -1] = [parts[-2]] + parts[1 : -2]
> > ...     parts[1 : -1] = [" ".join(parts[1 : -1])]
> > ...     return parts
> > ...
> > >>> print parse_address("1000 RAMPART S ST")
> > ['1000', 'S RAMPART', 'ST']
> > >>> print parse_address("100 JOHN CHURCHILL CHASE ST")
> > ['100', 'JOHN CHURCHILL CHASE', 'ST']
> This is a nice approach I wouldn't have thought to pursue. I've never
> seen this referencing of list elements in reverse order with negative
> values, so that certainly expands my knowledge of Python. Of course,
> I'd want to check for other directionals -- probably with a list
> check, e.g.,
> if parts[-2] in ('E', 'W', 'N', 'S'):
> Thanks for sharing your approach.
After studying this again today, I realized the ingenuity of
slicing the list with a negative index (counting from the right) -- one
doesn't have to worry about the number of words in the string.
To translate for those who may follow, the expression "parts[1 : -1]"
means gather list items from index 1 (the second item in the list) up
to, but not including, the last item in the list. The value in this
is that we already know the first list element after a split() will be
the street number. The last element will be the street type.
Everything in between, no matter how many words, will be the street
name -- excepting, of course, the instances where there's a street
direction added in, as captured in the example above.
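
For anyone who wants to try it, here is a minimal sketch of the same idea
that also handles the other directionals I mentioned. The DIRECTIONALS
tuple and the length check are my own assumptions, not part of the
original suggestion, and I've used print() with a single argument so it
should also run under Python 3.

```python
# Assumed set of directionals; extend as needed (e.g. "NE", "SW").
DIRECTIONALS = ("N", "S", "E", "W")

def parse_address(address):
    parts = address.split()
    # Guard added for this sketch: need a number, a name and a type.
    if len(parts) < 3:
        raise ValueError("expected at least three words: %r" % address)
    number = parts[0]        # first item: street number
    street_type = parts[-1]  # last item: street type
    middle = parts[1:-1]     # index 1 up to, but not including, the last item
    # Move a trailing directional in front of the street name.
    if len(middle) > 1 and middle[-1] in DIRECTIONALS:
        middle = [middle[-1]] + middle[:-1]
    return [number, " ".join(middle), street_type]

print("100 JOHN CHURCHILL CHASE ST".split()[1:-1])
# ['JOHN', 'CHURCHILL', 'CHASE']
print(parse_address("1000 RAMPART S ST"))
# ['1000', 'S RAMPART', 'ST']
```

The slice never mentions the length of the list, so a one-word or a
five-word street name goes through the same code path.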
A very nice solution. Thanks again!