Python 3 doc update (re) symbolic group: name validity.

Hello,

I'd like to report a bug in the Python 3.x documentation, related to the move from Python 2 to Python 3. The Python Standard Library reference, chapter 6.2.1 "Regular Expression Syntax", says "Group names must be valid Python identifiers", which is not true for 3.x: (?P<name>...) accepts only names that are valid Python 2 identifiers (unlike the field names of collections.namedtuple). I've also checked the re HOWTO and the tutorial: neither says which group names are valid, so there is no mistake there.

>>> re.match(r'(?P<DejaVu>test)','test')
<_sre.SRE_Match object at 0x00BF25A0>
>>> assert 'DéjàVu'.isidentifier()
>>> re.match(r'(?P<DéjàVu>test)','test')
Traceback (most recent call last):
  File "C:\Python32\lib\functools.py", line 176, in wrapper
    result = cache[key]
KeyError: (<class 'str'>, '(?P<DéjàVu>test)', 0)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python32\lib\re.py", line 153, in match
    return _compile(pattern, flags).match(string)
  File "C:\Python32\lib\re.py", line 255, in _compile
    return _compile_typed(type(pattern), pattern, flags)
  File "C:\Python32\lib\functools.py", line 180, in wrapper
    result = user_function(*args, **kwds)
  File "C:\Python32\lib\re.py", line 267, in _compile_typed
    return sre_compile.compile(pattern, flags)
  File "C:\Python32\lib\sre_compile.py", line 491, in compile
    p = sre_parse.parse(p, flags)
  File "C:\Python32\lib\sre_parse.py", line 692, in parse
    p = _parse_sub(source, pattern, 0)
  File "C:\Python32\lib\sre_parse.py", line 315, in _parse_sub
    itemsappend(_parse(source, state))
  File "C:\Python32\lib\sre_parse.py", line 552, in _parse
    raise error("bad character in group name")
sre_constants.error: bad character in group name

=> From Lib\sre_parse.py:

def isident(char):
    return "a" <= char <= "z" or "A" <= char <= "Z" or char == "_"

def isdigit(char):
    return "0" <= char <= "9"

def isname(name):
    # check that group name is a valid string
    if not isident(name[0]):
        return False
    for char in name[1:]:
        if not isident(char) and not isdigit(char):
            return False
    return True

Thank you for looking into this report (sorry for my bad English).
Waiting for a response: <pat.jln@hotmail.fr>
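
To make the mismatch described in the report concrete, here is a minimal sketch that compares the ASCII-only check quoted from Lib\sre_parse.py with str.isidentifier(). The helper names isname_ascii and compare are hypothetical, introduced only for this illustration; the empty-name guard is an addition and is not in the quoted source.

```python
def isident(char):
    # Same ASCII-only test as in the quoted sre_parse.py excerpt.
    return "a" <= char <= "z" or "A" <= char <= "Z" or char == "_"


def isdigit(char):
    return "0" <= char <= "9"


def isname_ascii(name):
    # Hypothetical name: mirrors the quoted isname(), plus a guard for "".
    if not name or not isident(name[0]):
        return False
    for char in name[1:]:
        if not isident(char) and not isdigit(char):
            return False
    return True


def compare(name):
    # Returns (accepted by the ASCII-only check, valid Python 3 identifier).
    return isname_ascii(name), name.isidentifier()


for candidate in ("DejaVu", "DéjàVu", "2fast"):
    print(candidate, compare(candidate))
```

Any name for which the two values differ, such as 'DéjàVu', is a valid Python 3 identifier that the quoted check rejects, which is the gap between the documentation and the behaviour shown in the traceback.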

Hi Patrice,

thanks for your report. This is now (belatedly) fixed in the code, and in future Python versions all valid Python identifiers will be allowed in group names.

cheers,
Georg

On 13.01.2012 17:24, Patrice Joulain wrote:
> Hello,
> I'd like to report a bug dealing with update from Python 2 to Python 3, in the Python 3.x Documentation.
> [...]
_______________________________________________
docs mailing list
docs@python.org
http://mail.python.org/mailman/listinfo/docs
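
One way to check whether a given interpreter already includes the fix mentioned in the reply is to try compiling a pattern with a non-ASCII group name. The following is only a sketch: the helper name group_name_accepted is hypothetical, and the result depends on the Python version it runs on.

```python
import re


def group_name_accepted(name):
    """Return True if this interpreter compiles (?P<name>...) without error."""
    try:
        re.compile(r"(?P<%s>test)" % name)
    except re.error:
        # Raised for a rejected group name, e.g. "bad character in group name".
        return False
    return True


for candidate in ("DejaVu", "DéjàVu"):
    # Prints the name, whether it is a valid identifier, and whether
    # the re module accepts it as a group name.
    print(candidate, candidate.isidentifier(), group_name_accepted(candidate))
```

On an interpreter that includes the fix, the last two columns should agree for both names; on the Python 3.2 installation shown in the report, 'DéjàVu' would be a valid identifier but a rejected group name.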
participants (2)
- Georg Brandl
- Patrice Joulain