[Python-Dev] regexp in Python
mike.klaas at gmail.com
Fri Mar 23 18:59:34 CET 2007
On 3/23/07, Fredrik Lundh <fredrik at pythonware.com> wrote:
> Bartlomiej Wolowiec wrote:
> > For some time I'm interested in regular expressions and Finite State Machine.
> > Recently, I saw that Python uses "Secret Labs' Regular Expression Engine",
> > which very often works too slow. Its pesymistic time complexity is O(2^n),
> > although other solutions, with time complexity O(n*m) ( O(n*m^2), m is the
> > length of the regular expression and n is the length of the text,
> > introduction to problem: http://swtch.com/~rsc/regexp/regexp1.html )
> that article almost completely ignores all the subtle capturing and left-
> to-right semantics that a perl-style engine requires, though. trust me,
> this is a much larger can of worms than you might expect. but if you're
> ready to open it, feel free to hack away.
> > major part of regular expressions
> the contrived example you used has nothing whatsoever to do with
> "major part of regular expressions" as seen in the wild, though. I'd
> be much more interested in optimizations that focuses on patterns
> you've found in real code.
A fruitful direction that is not as ambitious as re-writing the entire
engine would be to add "independent group" assertions to python's RE
syntax [ (?> ... ) in perl]. They are rather handy for optimizing the
malperforming cases alluded to here (which rarely occur as the OP
posted, but tend to crop up in less malignant forms).
More information about the Python-Dev