[Python-Dev] regex module
Guido van Rossum
guido at python.org
Wed Jan 13 18:58:04 CET 2010
Memories of days past... Python had several regular expression
implementations before, one of which was called "regex".
But I would rather not have a new module -- I would much rather have a
flag specifying the new (backwards incompatible) syntax/semantics. The
flag would have a long name (e.g. re.NEW_SYNTAX), a short name (e.g.
re.N) and an inline syntax, "(?n)...".
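
For comparison, existing re flags already come in these three spellings. The sketch below uses re.IGNORECASE, re.I and "(?i)" as a stand-in, because re.NEW_SYNTAX, re.N and "(?n)" are only names proposed in this message and are not part of the re module. The sample text is made up for illustration.

```python
import re

text = "Hello WORLD"  # hypothetical sample input

# Long flag name.
long_form = re.findall(r"world", text, re.IGNORECASE)
# Short alias for the same flag.
short_form = re.findall(r"world", text, re.I)
# Inline form, written at the start of the pattern itself.
inline_form = re.findall(r"(?i)world", text)

print(long_form, short_form, inline_form)
```

All three calls should find the same match; a new-syntax flag would presumably be selectable in the same three ways.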
On Tue, Jan 12, 2010 at 7:58 PM, Brett Cannon <brett at python.org> wrote:
> On Tue, Jan 12, 2010 at 14:10, MRAB <python at mrabarnett.plus.com> wrote:
>> Hi all,
>> I'm back on the regex module after doing other things and I'd like your
>> opinion on a number of matters:
>> Firstly, the current re module has a bug whereby it doesn't split on
>> zero-width matches. The BDFL has said that this behaviour should be
>> retained by default in case any existing software depends on it. My
>> question is: should my regex module still do this for Python 3?
>> Speaking personally, I'd like it to behave correctly, and Python 3 is
>> the version where backwards-compatibility is allowed to be broken.
> If it is a separate module under a different name it can do the proper
> thing. People will just need to be aware of the difference when they import
> the module.
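
To make the zero-width case concrete, here is a minimal sketch that assumes Python 3.7 or later, where re.split does split on empty matches. With the re module as it stood at the time of this thread, the same call would be expected to return the input unsplit. The pattern and sample string are made up for illustration.

```python
import re

# The lookahead matches the empty string just before each capital letter,
# so every match is zero-width.
words = re.split(r"(?=[A-Z])", "oneTwoThree")
print(words)  # on Python 3.7+: ['one', 'Two', 'Three']
```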
>> Secondly, Python 2 is reaching the end of the line and Python 3 is the
>> future. Should I still release a version that works with Python 2? I'm
>> thinking that it could be confusing if the new regex module did zero-width
>> splits correctly in Python 3 but not in Python 2. And also, should I
>> release it only for Python 3 as a 'carrot'?
> That's totally up to you. There is practically no chance of it getting into
> the 2.x stdlib at this point since 2.7b1 is coming up and this
> module has not been out in the wild for a year (to my knowledge). If you
> want to support 2.x that's fine and I am sure users would appreciate it, but
> it isn't necessary to get into the Python 3 stdlib.
>> Finally, the module allows some extra backslash escapes, e.g. \g<name>, in
>> the pattern. Should it treat ill-formed escapes, e.g. \g, as it would have
>> treated them in the re module?
> If you want to minimize the differences then it should probably match. As I
> said, since it is a different name to import under it can deviate where
> reasonable, just make sure to clearly document the deviations.
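
For reference, a small sketch of how the stdlib re module treats \g, assuming Python 3.7 or later: \g<name> is accepted in a replacement template, while a bare \g inside a pattern is rejected as an unknown escape. Older versions of re were more lenient about unknown escapes in patterns, so this shows present-day behaviour rather than what was current when this thread was written. The group name and sample text are made up for illustration.

```python
import re

# In a replacement template, \g<name> refers to a named group.
wrapped = re.sub(r"(?P<word>\w+)", r"[\g<word>]", "spam eggs")

# In a pattern, a bare \g is an unknown escape and fails to compile.
try:
    re.compile(r"\g")
    bare_g_compiles = True
except re.error:
    bare_g_compiles = False

print(wrapped, bare_g_compiles)
```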
>> Python-Dev mailing list
>> Python-Dev at python.org
--Guido van Rossum (python.org/~guido)