On 01. 11. 21 18:32, Serhiy Storchaka wrote:
This is excellent!
01.11.21 14:17, Petr Viktorin wrote:
CPython treats the control character NUL (``\0``) as end of input, but many editors simply skip it, possibly showing code that Python will not run as a regular part of a file.
It is an implementation detail and we will get rid of it. It only happens when you read the Python script from a file. If you import it as a module or run with runpy, the NUL character is an error.
That brings us to possible changes in Python in this area, which is an interesting topic.

As for \0, can we ban all ASCII & C1 control characters except whitespace? I see no place for them in source code.

For homoglyphs/confusables, should there be a SyntaxWarning when an identifier looks like ASCII but isn't?

For right-to-left text: does anyone actually name identifiers in Hebrew/Arabic? AFAIK, we should allow a few non-printing "joiner"/"non-joiner" characters to make it possible to use all Arabic words. But it would be great to consult with users/teachers of the languages.

Should Python run the bidi algorithm when parsing and disallow reordered tokens? Maybe optionally?
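
To make the control-character question concrete, here is a rough sketch of such a check written as an external linter, not as anything CPython does today. The function name `find_suspicious_chars` and both constant sets are hypothetical. It assumes "whitespace" means tab, LF, CR and form feed, and it flags explicit bidi embedding/override/isolate characters anywhere in the text, which is much cruder than running the bidi algorithm and checking for reordered tokens. Joiner/non-joiner characters are deliberately left alone.

```python
import unicodedata

# Assumption: these are the only control characters worth allowing in source.
ALLOWED_WHITESPACE = {"\t", "\n", "\r", "\f"}

# Unicode bidirectional classes of the explicit embedding, override and
# isolate controls (e.g. U+202E RIGHT-TO-LEFT OVERRIDE has class "RLO").
BIDI_CONTROL_CLASSES = {
    "LRE", "RLE", "LRO", "RLO", "PDF", "LRI", "RLI", "FSI", "PDI",
}


def find_suspicious_chars(source):
    """Return (line, column, kind, code point) for each flagged character.

    Hypothetical helper: 'control' covers ASCII and C1 control characters
    other than the allowed whitespace; 'bidi' covers explicit bidi controls.
    Lines and columns are 1-based.
    """
    hits = []
    for lineno, line in enumerate(source.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            cp = ord(ch)
            is_control = cp < 0x20 or 0x7F <= cp <= 0x9F
            if is_control and ch not in ALLOWED_WHITESPACE:
                hits.append((lineno, col, "control", cp))
            elif unicodedata.bidirectional(ch) in BIDI_CONTROL_CLASSES:
                hits.append((lineno, col, "bidi", cp))
    return hits
```

For example, `find_suspicious_chars("a = 1\nb\0 = 2\n")` reports the NUL on line 2, column 2. A check like this says nothing about homoglyphs; that would need the Unicode confusables data, which the standard library does not ship.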