On Tue, Nov 02, 2021 at 05:55:55PM +0200, Serhiy Storchaka wrote:
> All control characters except CR, LF, TAB and FF are banned outside comments and string literals. I think it is worth banning them in comments and string literals too. In string literals you can use backslash-escape sequences, and comments should be human readable, so there is no reason to include control characters in them. There is a precedent of emitting warnings for some superficial escapes in strings.
Agreed. I don't think there is any good reason for including control characters (apart from whitespace) in comments. In strings, I would consider allowing VT (vertical tab) as well, since that is whitespace too:
>>> '\v'.isspace()
True
But I don't have a strong opinion on that.

[Petr]
> For homoglyphs/confusables, should there be a SyntaxWarning when an identifier looks like ASCII but isn't?
Let's not enshrine as a language "feature" that non Western European languages are dangerous second-class citizens.
> It would virtually ban Cyrillic. There are a lot of Cyrillic letters which look like Latin letters, and there are complete words written in Cyrillic which by accident look like other words written in Latin.
Agreed.
> It is a job for linters, which can have many options for configuring acceptable scripts, use spelling dictionaries and dictionaries of homoglyphs, etc.
Linters and editors. I have no objection to people using editors that highlight non-ASCII characters in blinking red letters, so long as I can turn that option off :-)

-- Steve
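
For illustration only, here is a rough sketch of the kind of check a linter could make optional. It is not taken from any existing tool: the helper names scripts_in and mixed_script_identifiers are hypothetical, and since the unicodedata module exposes no real Script property, the first word of each character's Unicode name is used as a crude stand-in for its script.

```python
import io
import tokenize
import unicodedata


def scripts_in(identifier):
    """Return the set of (approximate) script names used by the letters."""
    scripts = set()
    for ch in identifier:
        if ch.isalpha():
            # "CYRILLIC SMALL LETTER ES" -> "CYRILLIC". This is a heuristic,
            # an assumption of this sketch, not a true Unicode Script lookup.
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts


def mixed_script_identifiers(source):
    """Yield (line, name, scripts) for names mixing letters of several scripts."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    for tok in tokens:
        if tok.type == tokenize.NAME:
            scripts = scripts_in(tok.string)
            if len(scripts) > 1:
                yield tok.start[0], tok.string, sorted(scripts)


if __name__ == "__main__":
    # The second letter of the first identifier is CYRILLIC SMALL LETTER ES.
    sample = "s\u0441ore = 1\nscore = 2\n"
    for line, name, scripts in mixed_script_identifiers(sample):
        print(f"line {line}: {name!r} mixes {', '.join(scripts)}")
```

Note that this only flags identifiers that mix scripts. A word written entirely in Cyrillic that happens to look like a Latin word passes untouched, which is exactly why such checks need dictionaries and per-project configuration, and belong in a linter rather than in the compiler.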