Tim Peters wrote:
[MAL, to Skip]
Huh ? That should not be possible ! Python literals are still ASCII.
ümlaut = 'ümlaut'
  File "<stdin>", line 1
    ümlaut = 'ümlaut'
    ^
SyntaxError: invalid syntax
That was Guido's intent, and what the Ref Man says, but the tokenizer uses C's isalpha() so in reality it's locale-dependent. I think at least one German on Python-Dev has already threatened to kill him if he ever fixes this bug <wink>.
Wasn't me for sure... even in the Unicode age, I believe that Python source code should maintain readability by not allowing all alpha(numeric) characters for use in identifiers (there are lots of them in Unicode). Shouldn't we fix the tokenizer to explicitly check for 'a'...'z' and 'A'...'Z' (and likewise '0'...'9' for digits) ?!

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting: http://www.egenix.com/
Python Software: http://www.lemburg.com/python/
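
For illustration only, here is a minimal Python sketch of the explicit range check proposed above. The tokenizer itself is C code, so this is not its implementation; the name is_ascii_identifier and the two constant sets are hypothetical, introduced just for this example. It accepts a name only if every character is in 'a'...'z', 'A'...'Z', '0'...'9' or '_', and the first character is not a digit.

```python
import string

# Hypothetical names for illustration; not taken from the tokenizer source.
ASCII_LETTERS = frozenset(string.ascii_letters)  # 'a'..'z' and 'A'..'Z'
ASCII_DIGITS = frozenset(string.digits)          # '0'..'9'


def is_ascii_identifier(name):
    """Return True if name is built only from ASCII letters, digits and '_',
    and does not start with a digit."""
    if not name:
        return False
    first = name[0]
    if first not in ASCII_LETTERS and first != "_":
        return False
    return all(
        ch in ASCII_LETTERS or ch in ASCII_DIGITS or ch == "_"
        for ch in name[1:]
    )


if __name__ == "__main__":
    for candidate in ("umlaut", "ümlaut", "_x1", "1x"):
        print(repr(candidate), is_ascii_identifier(candidate))
```

Unlike a general alphabetic test (Python 3's str.isalpha(), for example, returns True for 'ü'), this check gives the same answer regardless of locale, because it compares against fixed character sets.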