[docs] Python 3 doc update (re) symbolic group: name validity.

Patrice Joulain pat.jln at hotmail.fr
Fri Jan 13 17:24:34 CET 2012


I'd like to report a bug in the Python 3.x documentation, related to the update from Python 2 to Python 3.

The Python Standard Library, section 6.2.1 "Regular Expression Syntax", says "Group names must be valid Python identifiers", which is not true for 3.x.
(?P<name>...) accepts only names that are valid Python 2 identifiers (unlike the field names of collections.namedtuple).
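The namedtuple side of that comparison can be checked with a short sketch. The type name Record and the field value are illustrative choices, not taken from the report.

```python
from collections import namedtuple

# 'DéjàVu' is a valid Python 3 identifier, so namedtuple accepts it as a field name.
Record = namedtuple("Record", ["DéjàVu"])  # "Record" is a hypothetical type name

row = Record(DéjàVu="test")
print(row)
print(Record._fields)
```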

I've checked the re HOWTO and the tutorial: neither specifies what makes a group name valid, so there is no mistake there.

>>> re.match(r'(?P<DejaVu>test)','test')
<_sre.SRE_Match object at 0x00BF25A0>
>>> assert 'DéjàVu'.isidentifier()
>>> re.match(r'(?P<DéjàVu>test)','test')
Traceback (most recent call last):
  File "C:\Python32\lib\functools.py", line 176, in wrapper
    result = cache[key]
KeyError: (<class 'str'>, '(?P<DéjàVu>test)', 0)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python32\lib\re.py", line 153, in match
    return _compile(pattern, flags).match(string)
  File "C:\Python32\lib\re.py", line 255, in _compile
    return _compile_typed(type(pattern), pattern, flags)
  File "C:\Python32\lib\functools.py", line 180, in wrapper
    result = user_function(*args, **kwds)
  File "C:\Python32\lib\re.py", line 267, in _compile_typed
    return sre_compile.compile(pattern, flags)
  File "C:\Python32\lib\sre_compile.py", line 491, in compile
    p = sre_parse.parse(p, flags)
  File "C:\Python32\lib\sre_parse.py", line 692, in parse
    p = _parse_sub(source, pattern, 0)
  File "C:\Python32\lib\sre_parse.py", line 315, in _parse_sub
    itemsappend(_parse(source, state))
  File "C:\Python32\lib\sre_parse.py", line 552, in _parse
    raise error("bad character in group name")
sre_constants.error: bad character in group name


=> From Lib\sre_parse.py:

def isident(char):
    return "a" <= char <= "z" or "A" <= char <= "Z" or char == "_"

def isdigit(char):
    return "0" <= char <= "9"

def isname(name):
    # check that group name is a valid string
    if not isident(name[0]):
        return False
    for char in name[1:]:
        if not isident(char) and not isdigit(char):
            return False
    return True
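To make the mismatch concrete, here is a minimal sketch that copies the three helpers quoted above and compares isname() with str.isidentifier() on a few sample names. The sample list and the results dictionary are illustrative assumptions. The sketch deliberately does not call the re module, whose behavior may differ between Python versions.

```python
# Helpers copied from the Lib\sre_parse.py excerpt quoted in this report.
def isident(char):
    return "a" <= char <= "z" or "A" <= char <= "Z" or char == "_"

def isdigit(char):
    return "0" <= char <= "9"

def isname(name):
    # ASCII-only check: first character is a letter or underscore,
    # the remaining characters are letters, underscores or digits.
    if not isident(name[0]):
        return False
    for char in name[1:]:
        if not isident(char) and not isdigit(char):
            return False
    return True

# Illustrative sample names (not from the report, except the two DejaVu variants).
samples = ["DejaVu", "DéjàVu", "_x1", "1abc"]

# Map each name to (accepted by isname, accepted by str.isidentifier).
results = {name: (isname(name), name.isidentifier()) for name in samples}

for name, (ascii_ok, py3_ok) in results.items():
    print(f"{name!r}: isname={ascii_ok}, isidentifier={py3_ok}")
```

Any name for which the two columns differ is a valid Python 3 identifier that the quoted isname() check rejects as a group name.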

Thank you for looking into this report (sorry for my bad English).

I look forward to a response: <pat.jln at hotmail.fr>
