[Python-ideas] allow `lambda' to be spelled λ

Stephen J. Turnbull turnbull.stephen.fw at u.tsukuba.ac.jp
Wed Jul 20 12:44:12 EDT 2016


Nick Coghlan writes:

 > The reason that can help is that the main problem with "improving"
 > error messages, is that it can be really hard to tell whether the
 > improvements are actually improvements or not

Personally, I think the real issue here is that the curly quote (and
things like the mathematical PRIME character) is easily confused with
Python syntax, and it all looks like grit on Tim's monitor.  I tried
substituting an emoticon and the DOUBLE INTEGRAL, and it was quite
obvious what was wrong from the Python 3 error message.<wink/>

However, in this case, as far as I can tell from the error messages
induced by playing with ASCII, Python 3.5 thinks that all non-
identifier ASCII characters are syntactic (so for example it says that

    with open($file.txt") as f:

is "invalid syntax").  But non-ASCII characters (I guess including
the Latin 1 set?) are either letters, numerals, or just plain not
valid in a Python program, AIUI (outside of strings and comments, of
course).
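
For anyone who wants to repeat the experiment, here is a minimal sketch
using the built-in compile().  The helper name syntax_error_message and
the second sample are my own choices for illustration, and the exact
message text depends on the Python version, so I only print whatever
the interpreter reports.

```python
def syntax_error_message(source):
    """Return the SyntaxError message for source, or None if it compiles."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError as exc:
        return exc.msg
    return None


samples = [
    # Stray ASCII "$", as in the example above.
    'with open($file.txt") as f:\n    pass\n',
    # DOUBLE INTEGRAL (U+222C) outside a string or comment.
    "x = 1 \u222c 2\n",
]

for src in samples:
    # Print the message as reported by this interpreter.
    print(repr(src), "->", syntax_error_message(src))
```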

I would think the lexer could just treat each invalid character as an
invalid_token, which is always invalid in Python syntax, and the error
would be a SyntaxError with the message formatted something like

    "invalid character {} = U+{:04X}".format(ch, ord(ch))

This should avoid the strange placement of the position indicator,
too.
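
As a rough sketch of the idea (in Python rather than in the real
tokenizer, which is written in C), the check and the message might look
like the following.  The function names are hypothetical, and using
str.isidentifier() as the test for "letter or numeral" is my
approximation of the lexer's rule, not the actual implementation.

```python
def invalid_char_message(ch):
    """Format the message proposed above for a single character."""
    return "invalid character {} = U+{:04X}".format(ch, ord(ch))


def is_invalid_outside_strings(ch):
    """Approximate test: non-ASCII and not usable anywhere in an identifier."""
    if ord(ch) < 128:
        # ASCII characters are left to the ordinary syntax rules.
        return False
    # "a" + ch is an identifier only if ch may continue an identifier.
    return not ("a" + ch).isidentifier()


# GREEK SMALL LETTER LAMDA, RIGHT SINGLE QUOTATION MARK, PRIME, DOUBLE INTEGRAL
for ch in ["\u03bb", "\u2019", "\u2032", "\u222c"]:
    if is_invalid_outside_strings(ch):
        print(invalid_char_message(ch))
```

Under this approximation λ passes (it is a letter), while the curly
quote, PRIME and DOUBLE INTEGRAL are each reported with their code
point.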

If someday we decide to use a non-ASCII character for a syntactic
purpose, that's a big enough compatibility break in itself that
changing the invalid character set (and thus the definition of
invalid_token) is insignificant.

I'm pretty sure this is what a couple of earlier posters have in mind,
too.


