[Python-ideas] Matching multiple regex patterns simultaneously

Mathias Panzenböck grosser.meister.morti at gmx.net
Tue Mar 2 23:32:42 CET 2010

On 03/02/2010 09:39 PM, Andrey Fedorov wrote:
> So a couple of libraries (Django being the most popular that comes to
> mind) try to match a string against several regex expressions. I'm
> wondering if there exists a library to "merge" multiple compiled regex
> expressions into a single lookup. This could be exposed in an interface like:
>     http://gist.github.com/319905
> So for an example:
> rd = ReDict()
> rd['^foo$'] = 1
> rd['^bar*$'] = 2
> rd['^bar$'] = 3
> assert rd['foo'] == [1]
> assert rd['barrrr'] == [2]
> assert rd['bar'] == [2,3]
> The naive implementation I link is obviously inefficient. What would be
> the easiest way to go about compiling a set of regex-es together, so
> that they can be matched against a string at the same time? Are there
> any standard libraries that do this I'm not aware of?
> Cheers,
> Andrey

You can do something like this, where r is a single compiled regex with the named groups a, b and c:
 >>> r.match('barrrr').groupdict()
{'a': None, 'c': 'bar', 'b': 'barrrr'}
 >>> r.match('bar').groupdict()
{'a': None, 'c': 'bar', 'b': 'bar'}
 >>> r.match('foo').groups()
('foo', None, None)
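
The definition of r is not shown above. As a minimal sketch, here is one
pattern that reproduces the results in that session. The pattern itself is an
assumption for illustration, not necessarily the one that was actually used:

```python
import re

# Assumed pattern (hypothetical): group a stands for 'foo'; the lookahead
# captures the whole 'bar+' run as b, then c captures the leading 'bar'.
r = re.compile(r'^(?:(?P<a>foo)|(?=(?P<b>bar+))(?P<c>bar)r*)$')

m = r.match('barrrr')
print(m.groupdict())          # a is None, b == 'barrrr', c == 'bar'

m = r.match('bar')
print(m.groupdict())          # a is None, b == 'bar', c == 'bar'

print(r.match('foo').groups())  # only group a is set
print(r.match('ba'))            # None: 'bar+' needs at least one 'r'
```

Note that c is also set for 'barrrr' here, so the groups alone do not tell you
whether '^bar$' matched the whole string.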

OK, it's not 100% the same (it does not match 'ba'), but I think this should cover most cases where
you want something like this. Hmm, well. You should resolve it to a form where the subexpressions
do not overlap:
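
A rough sketch of what such a non-overlapping form could look like for the
three patterns from the question ('^foo$', '^bar*$', '^bar$'). The group names,
the values table and the lookup() helper are made up for this example, and the
rewrite into disjoint alternatives was done by hand:

```python
import re

# Hand-written disjoint alternatives (hypothetical names):
#   only_foo       matches what only '^foo$' matches
#   both_bar       matches what both '^bar*$' and '^bar$' match
#   only_bar_star  matches what only '^bar*$' matches ('ba', 'barr', 'barrr', ...)
combined = re.compile(
    r'(?P<only_foo>^foo$)'
    r'|(?P<both_bar>^bar$)'
    r'|(?P<only_bar_star>^ba(?:rr+)?$)'
)

# Each alternative maps to the values of every original pattern it stands for.
values = {
    'only_foo': [1],
    'both_bar': [2, 3],
    'only_bar_star': [2],
}

def lookup(s):
    """Return the values of all original patterns that match s."""
    m = combined.match(s)
    if m is None:
        return []
    # Exactly one named alternative can match, so lastgroup identifies it.
    return values[m.lastgroup]

print(lookup('foo'), lookup('barrrr'), lookup('bar'), lookup('ba'))
```

Since at most one alternative can match any given string, a single match call
is enough to find every original pattern that applies. The hard part is
computing the disjoint alternatives automatically, which this sketch does not
attempt.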


