[Python-Dev] interesting article on regex performance

Jared Grubb jared.grubb at gmail.com
Sat Mar 13 02:59:12 CET 2010


On 12 Mar 2010, at 15:22, skip at pobox.com wrote:
> 
>  Collin> re2 is not a full replacement for Python's current regex
>  Collin> semantics: it would only serve as an accelerator for a subset of
>  Collin> the current regex language. Given that, it makes perfect sense
>  Collin> that it would be optional on such minority platforms (much like
>  Collin> the incoming JIT).
> 
> Sure, but over the years Python has supported at least four different
> regular expression modules that I'm aware of (regex, regexp, and the current
> re module with different extension modules underneath it, perhaps there were
> others).  During some of that time more than one module was distributed with
> Python proper.  I think the desire today would be that only one regular
> expression module be distributed with Python (that would be my vote anyway).
> Getting people to move off the older libraries was difficult.  If re2 can't
> replace sre under the covers then I think it belongs in PyPI, not the Python
> distribution.  That said, that suggests to me that a different NFA or DFA
> implementation written in C would replace sre, one not written in C++.

re2 would be a supplement to re -- it is not a replacement, and Python would run fine if it's not present on some platforms. 

It's like a floating-point processor: you can do all the math you need with just an integer processor, but if an FPU is present, it makes sense to use it for the FP operations.
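
A rough sketch of what I mean, nothing definitive. It assumes a hypothetical binding module named `re2` that offers a `compile()` function and raises an exception on patterns it cannot handle; neither the module name nor that behavior is something Python ships today, and `compile_pattern` is just an illustrative helper name:

```python
import re

try:
    # Hypothetical optional accelerator; may be absent on some platforms.
    import re2 as _accel
except ImportError:
    _accel = None


def compile_pattern(pattern, flags=0):
    """Use the accelerator when it is present and applicable, else plain re."""
    # Assumption: the accelerator is only tried for flag-free patterns,
    # since its flag handling may differ from re's.
    if _accel is not None and flags == 0:
        try:
            return _accel.compile(pattern)
        except Exception:
            # Assumption: the binding raises on unsupported constructs
            # (backreferences, for example), so we fall through to re.
            pass
    return re.compile(pattern, flags)
```

Callers only ever see a compiled pattern object, so code behaves the same whether or not the accelerator is installed; the only difference is speed on the subset it supports.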

Jared

