[New-bugs-announce] [issue34515] lib2to3: support non-ASCII identifiers

monson report at bugs.python.org
Mon Aug 27 02:47:47 EDT 2018


New submission from monson <holymonson at gmail.com>:

Python 3.0 allows identifiers to contain characters from outside the ASCII range (see PEP 3131). See https://docs.python.org/3/reference/lexical_analysis.html#identifiers

But lib2to3 can't tokenize them correctly:
```
$ echo '中 = 1' | python3.7 -m lib2to3.pgen2.tokenize
1,0-1,1:	ERRORTOKEN	'中'
1,2-1,3:	OP	'='
1,4-1,5:	NUMBER	'1'
1,5-1,6:	NEWLINE	'\n'
2,0-2,0:	ENDMARKER	''
```
'中' should be tokenized as NAME instead of ERRORTOKEN.
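
For comparison, here is a minimal sketch that runs the same source through the standard library's own `tokenize` module rather than lib2to3's copy. It is only meant to illustrate the expected behaviour, not to propose a fix; the variable names `source` and `tokens` are made up for this example. On Python 3 the first token should come back as NAME, and `str.isidentifier()` should accept the same string.
```python
import io
import tokenize

# Same input as the shell example above.
source = "中 = 1\n"

# str.isidentifier() checks a string against the Python 3 identifier rules
# (PEP 3131), so it is a quick way to confirm '中' is a legal name.
print("中".isidentifier())

# generate_tokens() takes a readline callable that returns str lines.
tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
for tok in tokens:
    # tok_name maps the numeric token type to its symbolic name.
    print(tok.start, tok.end, tokenize.tok_name[tok.type], repr(tok.string))
```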

----------
components: Library (Lib)
messages: 324148
nosy: monson
priority: normal
severity: normal
status: open
title: lib2to3: support non-ASCII identifiers
versions: Python 3.7

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue34515>
_______________________________________

