[Python-Dev] Matching multiple regexes

MRAB python at mrabarnett.plus.com
Wed Mar 3 22:34:14 CET 2010


I've thought of a possible additional feature for my regex module and
I'm looking for opinions.

Occasionally someone asks how to match multiple regexes at once, and
the usual suggestion is to combine them into one regex using "|".
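
For reference, the manual "|" approach might look something like the
sketch below, using the standard re module. The names patterns,
combined and hits are only illustrative. Each pattern is wrapped in a
non-capturing group so that "|" separates whole patterns rather than
fragments of them.

```python
import re

# Illustrative patterns; any regex strings could be used here.
patterns = ["foo", "baz", "quuux"]

# Wrap each pattern in (?:...) so the alternation applies to the whole
# pattern, then join the pieces with "|".
combined = re.compile("|".join("(?:%s)" % p for p in patterns))

# With no capturing groups, findall returns the full text of each match.
hits = combined.findall("fooxxxbazyyyquuux")
print(hits)
```

Note that alternation is ordered: at any given position the first
alternative that matches wins, so the order of the patterns can matter.
What the combined regex does not tell you directly is which of the
original patterns matched.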

The feature would be to allow regex.compile, etc., to accept a list or
tuple of regex strings. These would be combined into a single regex
internally. The match object would get a new attribute, .index, giving
the index of the regex string that matched.
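
Something close to the proposed .index can be approximated today with
the standard re module. The sketch below is not how the regex module
would do it internally; compile_many and match_index are hypothetical
helper names, and the "_p<N>" group-naming scheme is an assumption that
only works if the patterns do not define groups with the same names.
Numbered backreferences inside the patterns would also be shifted by
the wrapper groups.

```python
import re

def compile_many(patterns):
    """Combine several regex strings into one compiled regex.

    Each pattern is wrapped in a named group _p0, _p1, ... so that the
    pattern which matched can be identified afterwards.
    """
    parts = ["(?P<_p%d>%s)" % (i, p) for i, p in enumerate(patterns)]
    return re.compile("|".join(parts))

def match_index(m):
    """Return the index of the pattern that produced match m.

    The wrapper group is the last group to close, so lastgroup holds its
    name; strip the "_p" prefix to recover the index.
    """
    return int(m.lastgroup[2:])

compiled = compile_many(["foo", "baz", "quuux"])
indexes = [match_index(m) for m in compiled.finditer("fooxxxbazyyyquuux")]
print(indexes)
```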

An additional feature could be to allow regex.sub to accept as a
replacement any object that has a __getitem__ method, and to call that
method with the match index when multiple regexes are used. (A
replacement string would still behave the same.)

This could mean that the following code would become possible:

     >>> subs = [("foo", "bar"), ("baz", "quux"), ("quuux", "foo")]
     >>> old_string, new_string = zip(*subs)
     >>> s = "fooxxxbazyyyquuux"
     >>> regex.sub(old_string, new_string, s)
     'barxxxquuxyyyfoo'
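
Until something like that exists, the same substitution can be emulated
with the standard re module by passing a function as the replacement.
This is only a sketch: multi_sub is a hypothetical helper, the "_p<N>"
group names are an assumption, and the replacements are inserted
literally, without the backreference expansion a normal replacement
string would get.

```python
import re

def multi_sub(patterns, replacements, string):
    """Replace matches of patterns[i] with replacements[i].

    replacements may be any object supporting indexing by integer
    (a list, a tuple, or a dict keyed by pattern index).
    """
    # Wrap each pattern in a named group so the matching pattern can be
    # identified from the match object.
    combined = re.compile(
        "|".join("(?P<_p%d>%s)" % (i, p) for i, p in enumerate(patterns))
    )

    def repl(m):
        # lastgroup names the wrapper group of the pattern that matched.
        index = int(m.lastgroup[2:])
        return replacements[index]

    return combined.sub(repl, string)

subs = [("foo", "bar"), ("baz", "quux"), ("quuux", "foo")]
old_strings, new_strings = zip(*subs)
result = multi_sub(old_strings, new_strings, "fooxxxbazyyyquuux")
print(result)
```

Because the substitution is done in a single pass, the "foo" produced by
replacing "quuux" is not itself replaced by "bar", which is the main
advantage over chaining several separate sub calls.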


