Seeking regex optimizer

Mirco Wahab wahab at chemie.uni-halle.de
Mon Jun 19 17:30:31 EDT 2006


Thus spoke andrewdalke at gmail.com (on 2006-06-19 22:51):

> It uses Aho-Corasick for the implementation which is fast and does what
> you expect it to do.  Nor does it have a problem of matching more than
> 99 possible strings as the regexp approach may have.

If you pull the strings into (?>( ... )) (atomic groups),
this wouldn't happen.

http://www.regular-expressions.info/atomic.html
   ...
   Everything between (?>) is treated as one single token
   by the regex engine, once the regex engine leaves the
   group.
   Because the entire group is one token, no backtracking
   can take place once the regex engine has found a match
   for the group.
   ...

Maybe Python 2.5 will have them?
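
To make the "no backtracking" point concrete, here is a minimal sketch.
It assumes an re module that does not accept (?>...) directly, so the
atomic group is emulated with the lookahead-plus-backreference idiom
(?=(...))\1, which relies on lookaheads not being backtracked into. The
two-word list is made up purely for illustration.

```python
import re

# Hypothetical word list; "foo" is deliberately a prefix of "foobar"
# and is listed first, so it is the alternative tried first.

# Plain alternation: after "foo" matches and the trailing \b fails,
# the engine backtracks into the group and tries "foobar".
plain = re.compile(r"\b(?:foo|foobar)\b")

# Emulated atomic group: the lookahead captures whichever alternative
# matches first, and \1 then consumes exactly that text. The engine
# does not go back into the lookahead to try another alternative.
atomic = re.compile(r"\b(?=(foo|foobar))\1\b")

plain_hit = plain.search("foobar")    # succeeds via backtracking
atomic_hit = atomic.search("foobar")  # no match: stuck with "foo"
short_hit = atomic.search("foo")      # still matches the short word

print(plain_hit, atomic_hit, short_hit)
```

The flip side is visible here as well: with an atomic group the order of
the alternatives matters, so longer strings should probably come before
their own prefixes.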

Regards

Mirco


