matching multiple regexs to a single line...

Alex Martelli aleax at aleax.it
Tue Nov 19 10:39:14 EST 2002


Alexander Sendzimir wrote:
   ...
> regexs = [
>     ( 'regex_id1', sre.compile( r'regex1' ) ),
>     ( 'regex_id2', sre.compile( r'regex2' ) ),
>     ( 'regex_id3', sre.compile( r'regex3' ) ),

Not sure why you're using sre here instead of re.  Anyway,
a MUCH faster way is to build a single RE pattern by:

    onerepat = '(' + ')|('.join([r.pattern for n, r in regexs]) + ')'
    onere = re.compile(onerepat)

Of course, it would be even faster to skip the wasteful step of
compiling each RE into regexs only to recover the original pattern
string through the r.pattern attribute, but anyway...
Then, bind matchobj = onere.match(line) and check matchobj.lastindex.
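
To make that concrete, here is a minimal sketch of the lastindex check. The ids and patterns are made up for illustration, the list holds plain pattern strings rather than compiled REs, and it assumes none of the patterns contains groups of its own. The which_regex helper is a hypothetical name, not part of the original code.

```python
import re

# Hypothetical (id, pattern-string) pairs; no pattern defines its own groups.
regexs = [
    ('integer', r'\d+'),
    ('word', r'[A-Za-z]+'),
    ('spaces', r'\s+'),
]

# One group per original pattern, joined by alternation.
onerepat = '(' + ')|('.join([p for n, p in regexs]) + ')'
onere = re.compile(onerepat)

def which_regex(line):
    """Return the id of the first pattern matching at the start of line, or None."""
    matchobj = onere.match(line)
    if matchobj is None:
        return None
    # Groups are numbered from 1, list entries from 0.
    return regexs[matchobj.lastindex - 1][0]
```

Alternatives are tried left to right, so when several patterns could match, the earliest one in the list wins, just as it would when looping over the separate REs in order.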

This doesn't work if the patterns define groups of their own, but
a slightly more sophisticated approach can help -- use _named_
groups for the join...
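
A rough sketch of the named-group variant, again with made-up ids and patterns. It assumes every id is a valid Python identifier (so it can serve as a group name) and that the patterns use no numbered backreferences, since wrapping them shifts the group numbers. It reads matchobj.lastgroup, which holds the name of the enclosing group that matched; identify is a hypothetical helper name.

```python
import re

# Hypothetical (id, pattern-string) pairs; 'decimal' defines groups of its own.
regexs = [
    ('decimal', r'(\d+)\.(\d+)'),
    ('word', r'[A-Za-z]+'),
]

# Wrap each pattern in a named group whose name is the id.
onerepat = '|'.join(['(?P<%s>%s)' % (n, p) for n, p in regexs])
onere = re.compile(onerepat)

def identify(line):
    """Return the id of the pattern matching at the start of line, or None."""
    matchobj = onere.match(line)
    if matchobj is None:
        return None
    # The outer named group closes after its inner groups, so lastgroup
    # names the alternative that matched, not one of the inner groups.
    return matchobj.lastgroup
```

The text matched by the winning pattern is then available as matchobj.group(matchobj.lastgroup).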


Alex



