Stuck on a three word street name regex
Lie Ryan
lie.1296 at gmail.com
Thu Jan 28 09:27:59 EST 2010
On 01/28/10 11:28, Brian D wrote:
> I've tackled this kind of problem before by looping through a patterns
> dictionary, but there must be a smarter approach.
>
> Two addresses. Note that the first has incorrectly transposed the
> direction and street name. The second has an extra space in it before
> the street type. Clearly done by someone who didn't know how to
> concatenate properly -- or didn't care.
>
> 1000 RAMPART S ST
>
> 100 JOHN CHURCHILL CHASE  ST
>
> I want to parse the elements into an array of values that can be
> inserted into new database fields.
>
> Anyone who loves solving these kinds of puzzles care to relieve my
> frazzled brain?
>
> The pattern I'm using doesn't keep the "CHASE" with the "JOHN
> CHURCHILL":
How does the following perform?
pat = re.compile(r'(?P<streetnum>\d+)\s+(?P<streetname>[A-Z\s]+)\s+(?P<streetdir>N|S|W|E|)\s+(?P<streettype>ST|RD|AVE?|)$')
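Or, more legibly (this version also accepts two-letter directions such as NE or SW, and makes the street type mandatory):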
pat = re.compile(
    r'''
    (?P<streetnum>  \d+ )                # required: series of digits
    \s+
    (?P<streetname> [A-Z\s]+ )           # required: one or more words
    \s+
    (?P<streetdir>  S?E|SW?|N?W|NE?| )   # optional: direction or nothing
    \s+
    (?P<streettype> ST|RD|AVE? )         # required: street type
    $                                    # required: end of string
    ''', re.VERBOSE)
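
To see what the pattern actually captures, here is a minimal sketch that runs it against the two sample addresses. The helper name parse_address is made up for illustration, and the second address is assumed to contain two spaces before ST (the "extra space" mentioned in the question). Because the pattern asks for whitespace on both sides of the direction slot, even when the direction is empty, the same address with a single space does not match at all.

```python
import re

# Same verbose pattern as above.
pat = re.compile(
    r'''
    (?P<streetnum>  \d+ )                # series of digits
    \s+
    (?P<streetname> [A-Z\s]+ )           # one or more words
    \s+
    (?P<streetdir>  S?E|SW?|N?W|NE?| )   # direction or nothing
    \s+
    (?P<streettype> ST|RD|AVE? )         # street type
    $
    ''', re.VERBOSE)


def parse_address(address):
    """Hypothetical helper: return [num, name, dir, type], or None if no match."""
    m = pat.match(address)
    if m is None:
        return None
    return [m.group('streetnum'), m.group('streetname'),
            m.group('streetdir'), m.group('streettype')]


print(parse_address('1000 RAMPART S ST'))
# ['1000', 'RAMPART', 'S', 'ST']

# Assumed input: two spaces before ST, as described in the question.
print(parse_address('100 JOHN CHURCHILL CHASE  ST'))
# ['100', 'JOHN CHURCHILL CHASE', '', 'ST']

# With only one space there is nothing left for the second \s+ to consume.
print(parse_address('100 JOHN CHURCHILL CHASE ST'))
# None
```

If the real data mixes one-space and two-space forms, the direction and its trailing whitespace would probably need to be made optional as a unit, which in turn means rethinking the greedy street-name group.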