[Python-Dev] interesting article on regex performance

skip at pobox.com
Fri Mar 12 20:29:09 CET 2010


    >> There are major practical problems associated with making such a leap
    >> directly (Google's re2 engine is in C++ rather than C and we'd have
    >> to keep the existing implementation around regardless to handle the
    >> features that re2 doesn't support).

    Collin> I don't see why C++ would be a deal-breaker in this case, since
    Collin> it would be restricted to an extension module.

Traditionally, Python has run on some (minority) platforms where C++ was
unavailable.  While the re module is a dynamically linked extension module
and thus could be considered "optional", I doubt anybody thinks of it as
optional nowadays.  The regression test suite itself uses it, so it would be
tough to run unit tests on such minority platforms without it.  You'd have
to maintain both the current sre implementation and the new re2
implementation for a long while into the future.
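
To make the "maintain both" point concrete, here is a rough sketch of what a
two-engine setup might look like from the Python side.  Everything in it is
an assumption for illustration: FakeFastEngine is a made-up stand-in for a
restricted engine like re2 (it just wraps the stdlib re module and refuses
backreferences, which re2 does not support), and compile_with_fallback is a
hypothetical helper, not an existing or proposed API.

```python
import re


class UnsupportedPattern(Exception):
    """Raised by the stand-in fast engine for features it cannot handle."""


class FakeFastEngine:
    """Hypothetical stand-in for a restricted engine such as re2.

    It delegates to the stdlib re module but rejects backreferences,
    using a deliberately crude textual check (assumption: good enough
    for a demo, not a real pattern parser).
    """

    @staticmethod
    def compile(pattern):
        if re.search(r"\\[1-9]", pattern):
            raise UnsupportedPattern(pattern)
        return re.compile(pattern)


def compile_with_fallback(pattern, fast_engine=FakeFastEngine):
    """Try the fast engine first; fall back to the existing engine.

    Returns the compiled pattern plus a label saying which engine
    accepted it.
    """
    try:
        return fast_engine.compile(pattern), "fast"
    except UnsupportedPattern:
        return re.compile(pattern), "fallback"


if __name__ == "__main__":
    for pat in (r"a+b", r"(a)\1"):
        _, engine = compile_with_fallback(pat)
        print(pat, "->", engine)
```

Even in this toy form, both code paths have to exist, be tested, and agree
on results, which is the long-term maintenance cost described above.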

As I was reading the code I thought, "Great! This stuff is so simple.  It's
even all written in C."  Then I looked at the re2 page.  :-(

Skip

