[Python-Dev] regex module
MRAB
python at mrabarnett.plus.com
Wed Jan 13 04:09:34 CET 2010
MRAB wrote:
> Hi all,
>
> I'm back on the regex module after doing other things and I'd like your
> opinion on a number of matters:
>
> Firstly, the current re module has a bug whereby it doesn't split on
> zero-width matches. The BDFL has said that this behaviour should be
> retained by default in case any existing software depends on it. My
> question is: should my regex module still do this for Python 3?
> Speaking personally, I'd like it to behave correctly, and Python 3 is
> the version where backwards-compatibility is allowed to be broken.
>
> Secondly, Python 2 is reaching the end of the line and Python 3 is the
> future. Should I still release a version that works with Python 2? I'm
> thinking that it could be confusing if the new regex module did zero-width
> splits correctly in Python 3 but not in Python 2. And also, should I
> release it only for Python 3 as a 'carrot'?
>
> Finally, the module allows some extra backslash escapes, e.g. \g<name>, in
> the pattern. Should it treat ill-formed escapes, e.g. \g, as it would have
> treated them in the re module?
>
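To make the zero-width split question above concrete, here is a rough sketch of what "splitting correctly" could mean. The helper name split_on_all_matches is made up for illustration and is not part of re or the regex module. It is built on finditer() rather than split(), because the built-in split() does not treat empty matches the same way in every Python version. Capture groups are ignored for simplicity.

```python
import re

def split_on_all_matches(pattern, string):
    """Hypothetical helper: split on every match, including empty ones."""
    pieces = []
    last = 0
    for m in re.finditer(pattern, string):
        # Keep the text between the previous match and this one.
        pieces.append(string[last:m.start()])
        last = m.end()
    # Keep whatever follows the final match.
    pieces.append(string[last:])
    return pieces

# \b only ever matches the empty string, so this is a pure zero-width split.
parts = split_on_all_matches(r"\b", "a b")
```

With this approach every word boundary becomes a split point, so leading and trailing empty strings appear in the result.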
I've just noticed something odd about the re module: the sub() method
doesn't take 'pos' or 'endpos' arguments. search() does; match() does;
findall() does; finditer() does; but sub() doesn't. Maybe there has
never been a demand for it. (Nor split(), for that matter.)
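
As an illustration only, a sub() restricted to a range can be approximated with finditer(), which does accept pos and endpos on a compiled pattern. The helper name sub_in_range is hypothetical, and this sketch assumes the replacement is a template string, not a callable.

```python
import re

def sub_in_range(pattern, repl, string, pos=0, endpos=None):
    """Hypothetical helper: like sub(), but only between pos and endpos."""
    compiled = re.compile(pattern)
    if endpos is None:
        endpos = len(string)
    pieces = []
    last = 0
    # finditer() on a compiled pattern takes pos and endpos.
    for m in compiled.finditer(string, pos, endpos):
        pieces.append(string[last:m.start()])
        # expand() applies backslash escapes such as \1 or \g<name>.
        pieces.append(m.expand(repl))
        last = m.end()
    # Text after the last match, including everything past endpos.
    pieces.append(string[last:])
    return "".join(pieces)

# search() honours pos, so the first match found starts at index 2.
first = re.compile(r"a").search("aaaa", 2)

# Only the two middle characters fall inside the range [1, 3).
result = sub_in_range(r"a", "X", "aaaa", 1, 3)
```

Slicing the string before calling sub() is not quite equivalent, since the pattern would then no longer see the text before pos (which matters for '^' and lookbehind).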