matching a street address with regular expressions

Karthik Gurusamy kar1107 at gmail.com
Wed Oct 10 21:21:16 CEST 2007


On Oct 10, 10:02 am, "Shawn Milochik" <Sh... at Milochik.com> wrote:
> On 10/4/07, Ricardo Aráoz <ricar... at gmail.com> wrote:
>
>
>
> > Christopher Spears wrote:
> > > One of the exercises in Core Python Programming is to
> > > create a regular expression that will match a street
> > > address.  Here is one of my attempts.
>
> > >>>> street =  "1180 Bordeaux Drive"
> > >>>> patt = "\d+ \w+"
> > >>>> import re
> > >>>> m = re.match(patt, street)
> > >>>> if m is not None: m.group()
> > > ...
> > > '1180 Bordeaux'
>
> > > Obviously, I can just create a pattern "\d+ \w+ \w+".
> > > However, the pattern would be useless if I had a
> > > street name like 3120 De la Cruz Boulevard.  Any
> > > hints?
>
> Also, that pattern can be easily modified to have any number of words
> at the end:
> patt = "\d+ (\w+){1,}"
> This would take care of 3120 De la Cruz Boulevard.

\w doesn't match whitespace, so (\w+){1,} stops at the first space
(it matches only "3120 De"). The following will work:

patt = r"\d+ (\w+\s*){1,}"


BTW, {1,} is the same as +, so
patt = r"\d+ (\w+\s*)+"
will work as well.
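
As a quick check, here is a minimal sketch that runs the pattern against
both addresses from the thread and compares it with the earlier
(\w+){1,} version. The helper name match_address is made up for
illustration only.

```python
import re

# Pattern from this message: digits, one space, then one or more
# words, each optionally followed by whitespace.
PATT = re.compile(r"\d+ (\w+\s*)+")

# Earlier suggestion from the thread, kept for comparison.
OLD_PATT = re.compile(r"\d+ (\w+){1,}")


def match_address(street, patt=PATT):
    """Return the matched text, or None if the pattern does not match."""
    m = patt.match(street)
    return m.group() if m is not None else None


for street in ("1180 Bordeaux Drive", "3120 De la Cruz Boulevard"):
    print(repr(match_address(street)), repr(match_address(street, OLD_PATT)))
```

With the new pattern the whole address is matched in both cases; the
older pattern stops after the first word. Note that \w+\s* also accepts
a trailing space, so the match can end in whitespace if the input does.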

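Note that using a raw string for an re pattern is safer in most cases:
it keeps Python from interpreting backslash escapes before the re
module sees them. "\d" happens to survive in a normal string only
because Python leaves unrecognized escapes alone.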
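
To see why the raw string matters, here is a small sketch using \b,
which means "backspace" to Python's string parser but "word boundary"
to the re module. The variable names are just for illustration.

```python
import re

street = "1180 Bordeaux Drive"

# Normal string: Python turns each "\b" into a single backspace
# character (0x08) before re ever sees the pattern.
plain = "\bDrive\b"

# Raw string: the backslash is kept, so re sees \b and treats it
# as a word boundary.
raw = r"\bDrive\b"

print(len(plain), len(raw))
print(re.search(plain, street))
print(re.search(raw, street).group())
```

The plain version looks for literal backspace characters and finds
nothing, while the raw version finds 'Drive'.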

Karthik



