[Python-Dev] what is happening with the regex module going into Python 3.3?

Gregory P. Smith greg at krypto.org
Mon Jun 4 00:02:32 CEST 2012


On Sun, Jun 3, 2012 at 2:38 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:

> On Mon, Jun 4, 2012 at 6:25 AM, Gregory P. Smith <greg at krypto.org> wrote:
> >
> > On Fri, Jun 1, 2012 at 5:37 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> >>
> >> ipaddress really made it in because I personally ran into the limitations
> >> of not having IP address support in the stdlib. I ended up doing quite a bit
> >> of prompting to ensure the process of cleaning up the API to modern stdlib
> >> standards didn't stall (even now, generating a module reference from the
> >> docstrings is still a pending task)
> >>
> >> With regex, the pain isn't there, since re already covers such a large
> >> subset of what regex provides.
> >
> > That last statement basically suggests that something like regex would never
> > be accepted until a CPython core developer was actually running into pain
> > with the many flaws in the re module (especially when it comes to Unicode).
> > I disagree with that.
>
> No, that's not really what I meant. Driving integration of a module
> takes *time* and *effort*. The decision to commit that effort has to
> be driven by something, and personal annoyance is a great motivator.
> In the case of PEP 3144, I happened to be in a position to do
> something about a gap in the standard library after the omission was
> made glaringly obvious [1].
>
> Getting this done was a combined effort from Peter (in getting the
> module API updated), myself and others (esp. Antoine) in reviewing the
> reference implementation's API and requesting changes and more
> recently Sandro Tosi has been doing most of the heavy lifting in
> getting the docs up to scratch.
>
> > Per the language summit, I think we need to just do it.  Put it in as re and
> > rename the existing re module to sre.
>
> No. We almost burned Jesse out dropping multiprocessing into 2.6 at
> the last minute, and many longstanding issues with that module are
> only being addressed now that Richard has the time to be involved
> again. SRE already suffers from a lack of maintenance, and we've had
> zero indication that regex will make that situation better (and
> several indications that it will actually make it worse). Matthew's
> silence on the topic is *not* encouraging, and nobody else has even
> volunteered to write a PEP, let alone agreed to maintain the module.
>
> > We could pull the plug on it and leave it out if substantial as yet unknown
> > problems that can't be fixed in time for release crop up during the beta 1
> > or 2 (release manager's decision).
>
> Unwinding changes to the build process is yet more work that may not
> be needed. We need to remember the purpose of the standard library:
> most of the time, it is *not* intended to be all things to all people.
> The status quo is that, if you're doing basic, primarily ASCII,
> regular expression processing, then "import re" will serve you just
> fine. If you're doing more than that, then you'll probably need to do
> "pip install regex" (or platform specific equivalent) and change your
> import to "import regex as re".
>
> That's not *great* (as the number of open Unicode bugs against SRE can
> attest), but it's far from unworkable. I consider it preferable to
> adding yet another big ball of C code to the stdlib in the absence of
> a PEP addressing the concerns already raised.
>
> >> My perspective is that it's now too late to make a change that big for
> >> 3.3, but the in principle approval holds for anyone that wants to work with
> >> MRAB and get the idea written up as a PEP for 3.4.
> >
> > Nonsense, as long as it's in before 3.3 Beta 1 (scheduled for June 23rd
> > according to PEP 398) it can go in.
> >
> > I don't like to claim that a PEP for this one is strictly necessary
>
> Why not? Requiring a PEP is the norm, not the exception. Even when
> there's agreement that something *should* be done, there are plenty of
> details to be thrashed out in turning in-principle agreement into a
> concrete plan of action.
>
> > but Nick
> > raises good questions to be answered and has good suggestions for what to
> > write up in the PEP in his earlier response that I certainly would prefer to
> > have gathered up and documented so that is the route I suggest.
> >
> > The issue seems to be primarily one of "who is volunteering to do it?"
>
> Correct, both in figuring out the integration details and in agreeing
> to maintain it in the future.
>
> Remember, now is better than never, but never is often better than
> *right* now :)
>
>
Heh.  Indeed.  Regardless, the module is available on PyPI whether it goes
in or not, so we do at least have something to point people to when they
need more than the existing undermaintained re (sre) module.
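
For anyone following the "import regex as re" suggestion quoted above, here
is a minimal sketch of an import that prefers the PyPI regex module and falls
back to the stdlib re when it is not installed.  The sample string and the
variable names are illustrative assumptions only:

```python
# Prefer the third-party regex module if it is installed
# (assumes "pip install regex" or a platform equivalent was run);
# otherwise fall back to the standard library's re module.
try:
    import regex as re
except ImportError:
    import re

# Illustrative input (an assumption, not from the thread).
text = "na\u00efve caf\u00e9"

# findall() exists with this signature in both modules.  With str
# patterns on Python 3, \w matches Unicode word characters.
words = re.findall(r"\w+", text)
print(words)
```

This only helps code that sticks to the API the two modules share; anything
that relies on regex-only features still needs the PyPI module installed.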

There are also other options with different properties such as
http://pypi.python.org/pypi/re2/.

-gps