Awkward code using multiple regexp searches
Steven Bethard
steven.bethard at gmail.com
Fri Sep 10 02:54:47 EDT 2004
Topher Cawlfield <cawlfiel <at> uiuc.edu> writes:
> Can anyone suggest a more elegant solution?
Does this do what you want?
>>> import re
>>> rexp1 = re.compile(r'blah(dee)blah')
>>> rexp2 = re.compile(r'hum(dum)')
>>> for s in ['blahdeeblah', 'blah blah', 'humdum humdum']:
...     result = rexp1.findall(s) or rexp2.findall(s) or [None]
...     print(repr(result[0]))
...
'dee'
None
'dum'
The findall method returns a list of all matches of the re in the string, or
an empty list if there were no matches. (When the pattern contains a single
group, as these do, the list holds the text of that group.) An empty list is
false, so if the first findall finds nothing, the or expression goes on to
evaluate the second findall, and if that one finds nothing either, the default
[None] is used. Because the result is always a list, I extract its first
element at the end.
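The same idea can be wrapped in a small function if you need it in more than
one place. This is only a sketch: the name first_capture is made up for
illustration, and it assumes each pattern has exactly one group.

```python
import re

rexp1 = re.compile(r'blah(dee)blah')
rexp2 = re.compile(r'hum(dum)')

def first_capture(s):
    # Hypothetical helper, not part of the re module.
    # An empty list is false, so `or` falls through to the next findall,
    # and finally to the [None] default.
    result = rexp1.findall(s) or rexp2.findall(s) or [None]
    return result[0]
```

Note that findall scans the whole string for every match, so for long lines
this does more work than a single search would.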
> I'm a little bit worried about doing the following in Python, since I'm
> not sure if the compiler is smart enough to avoid doing each regexp
> search twice:
>
> for line in inFile:
>     if rexp1.search(line):
>         something = rexp1.search(line).group(1)
>     elif rexp2.search(line):
>         somethingElse = rexp2.search(line).group(1)
You're right here: Python will call the method twice (and therefore search
the string twice). It has no way of knowing that these two calls to the same
method will actually return the same result. In general, there is no
guarantee that calling a method with the same arguments will return the same
result; file.read(100), for example, normally returns the next chunk of the
file on each call.
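If you would rather keep search(), one way around the double search is to bind
the match object to a name and test that. A minimal sketch, assuming the same
two patterns as in my example; the function name first_group and its return
convention are my own invention, not anything from your code:

```python
import re

rexp1 = re.compile(r'blah(dee)blah')
rexp2 = re.compile(r'hum(dum)')

def first_group(line):
    """Return (pattern number, group 1) for the first pattern that
    matches, or None if neither pattern matches."""
    m = rexp1.search(line)   # searched once; the match object is reused
    if m:
        return 1, m.group(1)
    m = rexp2.search(line)   # only reached when rexp1 did not match
    if m:
        return 2, m.group(1)
    return None
```

Inside your loop you would call first_group(line) and branch on the pattern
number, so each regexp is searched at most once per line.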
Steve