[Python-Dev] New regex module for 3.2?

Collin Winter collinwinter at google.com
Tue Jul 13 01:11:14 CEST 2010


On Mon, Jul 12, 2010 at 8:18 AM, Michael Foord
<fuzzyman at voidspace.org.uk> wrote:
> On 12/07/2010 15:07, Nick Coghlan wrote:
>>
>> On Mon, Jul 12, 2010 at 9:42 AM, Steven D'Aprano<steve at pearwood.info>
>>  wrote:
>>
>>>
>>> On Sun, 11 Jul 2010 09:37:22 pm Eric Smith wrote:
>>>
>>>>>
>>>>> The re2 comparison is interesting from the point of view of whether it
>>>>> should be included in the stdlib.
>>>>>
>>>>
>>>> Is "it" re2 or regex? I don't see having 2 regular expression engines
>>>> in the stdlib.
>>>>
>>>
>>> There's precedent though... the old regex engine and the new re engine
>>> were side-by-side for many years before regex was deprecated and
>>> finally removed in 2.5. Hypothetically, re2 could similarly be added to
>>> the standard library while re is deprecated.
>>>
>>
>> re2 deliberately omits some features for efficiency reasons, hence is
>> not even on the table as a possible replacement for the standard
>> library version. If someone is in a position where re2 can solve their
>> problems with the re module, they should also be in a position where
>> they can track it down for themselves.
>>
>>
>
> If it has *partial* compatibility, and big enough performance improvements
> for common cases, it could perhaps be used where the regex doesn't use
> unsupported features. This would have some extra cost in the compile phase,
> but would mean Python could ship with two regex engines but only one
> interface exposed to the programmer...

FWIW, this has all been discussed before:
http://aspn.activestate.com/ASPN/Mail/Message/python-dev/3829265. In
particular, I still believe that, "it's not obvious that enough Python
regexes would benefit from re2's performance/restrictions tradeoff to
make such a hybrid system worthwhile in the long term. (There is no
representative corpus of real-world Python regexes weighted for
dynamic execution frequency to use in assessing such tradeoffs
empirically like there is for JavaScript.)"
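
To make the hybrid idea concrete, here is a minimal sketch, not a proposal.
It assumes a hypothetical fast_compile callable standing in for whatever
RE2 binding is used (no real binding's API is implied), and it uses a crude
textual heuristic to spot constructs RE2 omits (backreferences, lookaround,
conditional groups). The heuristic errs toward falling back to re.

```python
import re

# Heuristic only: matches backreferences (\1, (?P=name)), lookaround
# ((?=, (?!, (?<=, (?<!) and conditional groups ((?(...)). It can report
# false positives (e.g. inside a character class), which merely means
# the pattern is handled by the stdlib engine.
_NEEDS_BACKTRACKING = re.compile(r"\\[1-9]|\(\?P=|\(\?<?[=!]|\(\?\(")


def needs_backtracking(pattern):
    """Return True if the pattern appears to use a feature RE2 lacks."""
    return _NEEDS_BACKTRACKING.search(pattern) is not None


def hybrid_compile(pattern, fast_compile=None):
    """Compile with the fast engine when possible, else with re.

    fast_compile is a hypothetical callable (pattern -> compiled object)
    supplied by the caller; it is an assumption of this sketch.
    """
    if fast_compile is not None and not needs_backtracking(pattern):
        try:
            return fast_compile(pattern)
        except Exception:
            # The fast engine rejected the pattern; fall through to re.
            pass
    return re.compile(pattern)
```

The extra scan in needs_backtracking is the compile-phase cost Michael
mentions; whether it pays off depends on how many real-world patterns
would actually take the fast path, which is the open question.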

Collin

>> MRAB's module offers a superset of re's features rather than a subset
>> though, so once it has had more of a chance to bake on PyPI it may be
>> worth another look.
>>
>> Cheers,
>> Nick.
>>
>>
>
>

