Multiple regex match idiom
Steffen Oschatz
Steffen.Oschatz at googlemail.com
Thu May 10 06:07:36 EDT 2007
On 9 May, 11:00, Hrvoje Niksic <hnik... at xemacs.org> wrote:
> I often have the need to match multiple regexes against a single
> string, typically a line of input, like this:
>
> if (matchobj = re1.match(line)):
>     ... re1 matched; do something with matchobj ...
> elif (matchobj = re2.match(line)):
>     ... re2 matched; do something with matchobj ...
> elif (matchobj = re3.match(line)):
>     ....
>
> Of course, that doesn't work as written because Python's assignments
> are statements rather than expressions. The obvious rewrite results
> in deeply nested if's:
>
> matchobj = re1.match(line)
> if matchobj:
>     ... re1 matched; do something with matchobj ...
> else:
>     matchobj = re2.match(line)
>     if matchobj:
>         ... re2 matched; do something with matchobj ...
>     else:
>         matchobj = re3.match(line)
>         if matchobj:
>             ...
>
> Normally I have nothing against nested ifs, but in this case the deep
> nesting unnecessarily complicates the code without providing
> additional value -- the logic is still exactly equivalent to the
> if/elif/elif/... shown above.
>
> There are ways to work around the problem, for example by writing a
> utility predicate that passes the match object as a side effect, but
> that feels somewhat non-standard. I'd like to know if there is a
> Python idiom that I'm missing. What would be the Pythonic way to
> write the above code?
Instead of scanning the same input over and over with different,
possibly complex, regexes and ugly nested ifs, I would suggest
defining a grammar and parsing the input once, with hooks registered
for each of your matching expressions.
SimpleParse (http://simpleparse.sourceforge.net) with a
DispatchProcessor, or pyparsing (http://pyparsing.wikispaces.com/) in
combination with setParseAction or something similar, is well suited
to such a task.
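To illustrate the "match once, dispatch to registered hooks" idea without
pulling in either library, here is a rough sketch that uses only the standard
re module. It is a stand-in for a real grammar, not SimpleParse or pyparsing
code: the three line formats, the group names, the hook functions and
dispatch() are all made up for the example.

```python
import re

# One combined pattern; each top-level named group is one alternative.
# The line formats below are hypothetical, chosen only for illustration.
LINE_RE = re.compile(r"""
    (?P<assign>  (?P<key>\w+) \s* = \s* (?P<value>.*) )
  | (?P<comment> \# \s* (?P<text>.*) )
  | (?P<section> \[ (?P<name>[^\]]+) \] )
""", re.VERBOSE)

# Hooks: one small function per alternative, each receiving the match object.
def on_assign(m):
    return ("assign", m.group("key"), m.group("value"))

def on_comment(m):
    return ("comment", m.group("text"))

def on_section(m):
    return ("section", m.group("name"))

# Registry mapping the name of an alternative to its hook.
HOOKS = {
    "assign": on_assign,
    "comment": on_comment,
    "section": on_section,
}

def dispatch(line):
    """Match the line once and call the hook of the alternative that matched."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    # The outer group of the matching alternative closes last, so
    # m.lastgroup holds its name.
    return HOOKS[m.lastgroup](m)

if __name__ == "__main__":
    for line in ["[main]", "x = 1", "# a remark", "???"]:
        print(repr(line), "->", dispatch(line))
```

Adding a new case then means adding one alternative and one hook, with no
further nesting. For anything beyond flat line formats, a real grammar with
parse actions is probably the better fit.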
Steffen