[Python-Dev] interesting article on regex performance

Collin Winter collinwinter at google.com
Fri Mar 12 22:40:50 CET 2010


On Fri, Mar 12, 2010 at 11:29 AM,  <skip at pobox.com> wrote:
>
>    >> There are major practical problems associated with making such a leap
>    >> directly (Google's re2 engine is in C++ rather than C and we'd have
>    >> to keep the existing implementation around regardless to handle the
>    >> features that re2 doesn't support).
>
>    Collin> I don't see why C++ would be a deal-breaker in this case, since
>    Collin> it would be restricted to an extension module.
>
> Traditionally Python has run on some (minority) platforms where C++ was
> unavailable.  While the re module is a dynamically linked extension module
> and thus could be considered "optional", I doubt anybody thinks of it as
> optional nowadays.  It's used in the regression test suite anyway.  It would
> be tough to run unit tests on such minority platforms without it.  You'd
> have to maintain both the current sre implementation and the new re2
> implementation for a long while into the future.

re2 is not a full replacement for Python's current regex semantics: it
would only serve as an accelerator for a subset of the current regex
language. Given that, it makes perfect sense that it would be optional
on such minority platforms (much like the incoming JIT).
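
As a rough sketch of what "optional accelerator" could mean in practice:
try the fast engine first, and fall back to the existing re module when the
accelerator is not installed or rejects the pattern. The `re2` module name,
its `compile()` function, and the `compile_pattern` helper are all
assumptions for illustration, not an existing stdlib API.

```python
import re

try:
    # Assumption: an optional third-party binding importable as `re2`
    # that exposes a compile() function similar to re.compile().
    import re2 as _accel
except ImportError:
    _accel = None  # e.g. a platform without a C++ toolchain


def compile_pattern(pattern, flags=0, accel=_accel):
    """Hypothetical helper: prefer the accelerator, fall back to re."""
    # Only try the accelerator for flag-free patterns, since this sketch
    # does not assume the two engines share flag values.
    if accel is not None and flags == 0:
        try:
            return accel.compile(pattern)
        except Exception:
            # The accelerator rejected the pattern (for example, a
            # backreference), so let the full engine handle it.
            pass
    return re.compile(pattern, flags)


# A backreference is outside re2's supported subset, so this pattern is
# expected to end up in the existing engine either way.
doubled = compile_pattern(r"(\w)\1")
print(bool(doubled.search("hello")))
```

On a platform without the accelerator, `_accel` is simply `None` and every
pattern goes through the current implementation unchanged.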

Collin

