[Python-3000] Invalid \U escape in source code gives hard-to-trace error

Guido van Rossum guido at python.org
Sun Jul 15 16:17:00 CEST 2007


When a source file contains a string literal with an out-of-range \U
escape (e.g. "\U12345678"), instead of a syntax error pointing to the
offending literal, I get this, without any indication of the file or
line:

UnicodeDecodeError: 'unicodeescape' codec can't decode bytes in
position 0-9: illegal Unicode character
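
For reference, the codec-level failure can be reproduced outside the parser. The following is only a minimal sketch: it assumes the standard "unicode_escape" codec behaves like the decoding step used for string literals, and decode_escape is a hypothetical helper name, not anything from the parser itself.

```python
import codecs


def decode_escape(raw: bytes) -> str:
    """Decode backslash escapes using the standard 'unicode_escape' codec.

    Hypothetical helper for illustration; not the parser's own code path.
    """
    return codecs.decode(raw, "unicode_escape")


if __name__ == "__main__":
    try:
        # 0x12345678 is above U+10FFFF, the largest valid code point.
        decode_escape(rb"\U12345678")
    except UnicodeDecodeError as exc:
        # The exception carries byte offsets within the literal only;
        # it has no file name or line number attached.
        print(type(exc).__name__, exc.start, exc.end, exc.reason)
```

The exception object only knows offsets relative to the bytes it was handed, which is presumably why no source location shows up in the message.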

This is quite hard to track down. (Both the location of the bad
literal in the source file, and the origin of the error in the parser.
:-) Can someone come up with a fix?
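
Until there is a proper fix, one way to find the offending literal is to scan the source text directly. This is a rough sketch rather than a real solution: find_bad_escapes is a hypothetical helper, and it is purely textual, so it assumes every \U followed by eight hex digits is a genuine escape and will also flag matches inside raw strings, comments, or after an escaped backslash.

```python
import re

# A backslash, an uppercase U, then exactly eight hex digits.
_BIG_U = re.compile(r"\\U([0-9a-fA-F]{8})")

# Largest valid Unicode code point.
MAX_CODE_POINT = 0x10FFFF


def find_bad_escapes(source: str):
    """Yield (line_number, column, escape_text) for out-of-range \\U escapes.

    Line numbers are 1-based, columns are 0-based. Hypothetical helper;
    it does not tokenize the source, so false positives are possible.
    """
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in _BIG_U.finditer(line):
            if int(match.group(1), 16) > MAX_CODE_POINT:
                yield lineno, match.start(), match.group(0)
```

Running it over the text of a suspect file (for example, for hit in find_bad_escapes(open(path).read()): print(hit), with path being whatever file is under suspicion) gives candidate locations to inspect by hand.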

I note that raw escapes show a slightly different error. I also note
that the same issue exists for u"..." literals in Python 2.5.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

