Re: [Python-ideas] allow `lambda' to be spelled λ

July 19, 2016


On Tue, Jul 19, 2016 at 7:21 AM Steven D'Aprano <steve@pearwood.info> wrote:
...
On Mon, Jul 18, 2016 at 10:29:34PM -0700, Rustom Mody wrote:
...
There was this question on the python list a few days ago:
Subject: SyntaxError: Non-ASCII character
[...]
I pointed out that the Python 2 error was more helpful (to my eyes) than
Python 3's.
And I pointed out how I thought the Python 3 error message could be
improved, but the Python 2 error message was not very good.
...
Python3
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/ariston/foo.py", line 31
    wf = wave.open(“test.wav”, “rb”)
                       ^
SyntaxError: invalid character in identifier
It would be much more helpful if the caret lined up with the offending
character. Better still, if the offending character was actually stated:
wf = wave.open(“test.wav”, “rb”)
                   ^
SyntaxError: invalid character '“' in identifier
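
A minimal sketch of the kind of message proposed above, assuming we only want to locate the first non-ASCII character in a line that cannot appear in an identifier. The helper name `describe_offending_char` is hypothetical and is not part of CPython's tokenizer; it only illustrates that the column and the character itself are easy to recover from the source line.

```python
import unicodedata


def describe_offending_char(line):
    """Return (column, char, Unicode name) for the first non-ASCII character
    that cannot continue an identifier, or None if there is no such character.

    Hypothetical helper for illustration; not how CPython's lexer works.
    """
    for col, ch in enumerate(line):
        if ord(ch) < 128:
            continue  # plain ASCII is left to the normal lexer rules
        # "a" + ch is a valid identifier only if ch may continue an identifier
        if not ("a" + ch).isidentifier():
            return col, ch, unicodedata.name(ch, "UNKNOWN")
    return None


line = 'wf = wave.open(“test.wav”, “rb”)'
col, ch, name = describe_offending_char(line)
print(line)
print(" " * col + "^")  # caret placed under the offending character
print(f"SyntaxError: invalid character {ch!r} ({name}) in identifier")
```

The caret alignment here counts code points, so it would drift for wide or combining characters; that is a limitation of the sketch, not a claim about the interpreter.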
...
Python2
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "foo.py", line 31
SyntaxError: Non-ASCII character '\xe2' in file foo.py on line 31, but no
encoding declared; see http://python.org/dev/peps/pep-0263/ for details
As I pointed out earlier, this is less helpful. The line itself is not
shown (although the line number is given), nor is the offending
character. (Python 2 can't show the character because it doesn't know
what it is -- it only knows the byte value, not the encoding.) But in
the person's text editor, chances are they will see what looks to them
like a perfectly reasonable character, and have no idea which is the
byte \xe2.
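
To illustrate why the byte value alone is a weak hint, here is a small sketch assuming the file was saved as UTF-8 (the original report does not state the encoding). The curly quote encodes to three bytes, of which Python 2 reports only the first, and several unrelated characters share that same lead byte.

```python
# Assumption: the source file is UTF-8 encoded.
quote = "\u201c"  # LEFT DOUBLE QUOTATION MARK, the character in the report
encoded = quote.encode("utf-8")
print(encoded)  # b'\xe2\x80\x9c'

# Python 2 names only the first byte of the sequence.
lead_byte = encoded[:1]

# Other common characters whose UTF-8 encoding starts with the same byte:
# right curly quote, em dash, euro sign.
others = ["\u201d", "\u2014", "\u20ac"]
same_lead = [c for c in others if c.encode("utf-8")[:1] == lead_byte]
print(same_lead)
```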
...
IOW
1. The lexer is internally (evidently from the error message) so
ASCII-oriented that any “unicode-junk” just defaults to being treated as
an identifier (presumably comments are dealt with earlier), and then if
that lexing action fails it mistakenly pinpoints a wrong *identifier*
rather than just an impermissible character, as Python 2 does.
You seem to be jumping to a rather large conclusion here. Even if you
are right that the lexer considers all otherwise-unexpected characters
to be part of an identifier, why is that a problem?
It's a problem because those characters could never be part of an
identifier.  So it seems like a bug.
...
I agree that it is mildly misleading to say
invalid character '“' in identifier
when “ is not part of an identifier:
py> '“test'.isidentifier()
False
but I don't think you can jump from that to your conclusion that
Python's unicode support is somewhat "wrongheaded". Surely a much
simpler, less inflammatory response would be to say that this one
specific error message could be improved?
But... is it REALLY so bad? What if we wrote it like this instead:
py> result = my§function(arg)
  File "<stdin>", line 1
    result = my§function(arg)
                        ^
SyntaxError: invalid character in identifier
Isn't it more reasonable to consider that "my§function" looks like it is
intended as an identifier, but it happens to have an illegal character
in it?
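
As a quick check of the two readings, the sketch below asks `str.isidentifier()` about the names discussed in this thread and then compiles the `my§function` line. The exact wording of the SyntaxError differs between Python versions, so the message is printed rather than assumed.

```python
# Which of the strings from the thread are valid identifiers?
candidates = ["“test", "my§function", "myfunction", "λ"]
results = {name: name.isidentifier() for name in candidates}
for name, ok in results.items():
    print(repr(name), ok)

# Compiling the line raises SyntaxError; the message text is version-dependent.
raised = False
try:
    compile("result = my§function(arg)", "<demo>", "exec")
except SyntaxError as exc:
    raised = True
    print("SyntaxError:", exc.msg)
```

Note that λ is accepted as an identifier character while § and “ are not, which is the distinction both sides of the argument rely on.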
...
Combine that with
2. matrix mult (@): OK to emulate Perl, but not to go outside ASCII
How does @ emulate Perl?
As for your second part, about not going outside of ASCII, yes, that is
official policy for Python operators, keywords and builtins.
...
makes it seem (to me) that Python's unicode support is somewhat wrongheaded.
--
Steve

Neil Girdhar