regex alternation problem

Tim Chase python.list at tim.thechases.com
Fri Apr 17 18:05:47 EDT 2009


> import re
> 
> s1 = "I am an american"
> 
> s2 = "I am american an "
> 
> for s in [s1, s2]:
>     print(re.findall(" (am|an) ", s))
> 
> # Results:
> # ['am']
> # ['am', 'an']
> 
> -------
> 
> I want the results to be the same for each string.  What am I doing
> wrong?

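In your first case, the regexp consumes the " am " (four 
characters, two of which are spaces).  findall() returns only 
non-overlapping matches, so the search resumes at "an american" 
and there is no leading space left for " an " to match.  You 
might try using \b, a zero-width word boundary, instead: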

   re.findall(r"\b(am|an)\b", s)
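
To see the difference side by side, here is a rough sketch that runs 
three patterns over your two strings.  The names "consuming", 
"boundary" and "lookahead" are just mine for illustration, and the 
lookahead variant is my guess at something that might also suit you 
if you really do need a literal space on each side:

```python
import re

s1 = "I am an american"
s2 = "I am american an "

# Original pattern: every match eats both surrounding spaces, and
# findall() only reports non-overlapping matches.
consuming = re.compile(r" (am|an) ")

# \b is zero-width, so neighbouring words never compete for a space.
boundary = re.compile(r"\b(am|an)\b")

# Assumed alternative: still require a literal space on each side,
# but only peek at the trailing one so the next match can reuse it.
lookahead = re.compile(r" (am|an)(?= )")

for s in [s1, s2]:
    print(consuming.findall(s), boundary.findall(s), lookahead.findall(s))

# s1: ['am'] ['am', 'an'] ['am', 'an']
# s2: ['am', 'an'] ['am', 'an'] ['am', 'an']
```

Note that \b is a bit more permissive than a literal space: it also 
matches at the very start or end of the string and next to 
punctuation, which may or may not be what you want.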

-tkc






