[Python-Dev] more unicode: \U support?
M.-A. Lemburg
mal@lemburg.com
Thu, 27 Jul 2000 22:29:59 +0200
Tim Peters wrote:
>
> [/F]
> > would it be a good idea to add \UXXXXXXXX
> > (8 hex digits) to 2.0?
> >
> > (only characters in the 0000-ffff range would
> > be accepted in the current version, of course).
I don't really see the point of adding \UXXXXXXXX when the
internal format used is UTF-16 with support for surrogates.
What should \U12341234 map to in a future implementation?
Two Python (UTF-16) Unicode characters?
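
To make the question above concrete, here is a rough sketch (in present-day
Python 3, not the 2.0 code base under discussion) of how a code point above
U+FFFF would split into two UTF-16 code units. The helper name
to_surrogate_pair is hypothetical, and the sample value U+10400 is picked
only for illustration. The value 0x12341234 from the question lies beyond
U+10FFFF, the highest code point UTF-16 can express, so the sketch rejects it.

```python
def to_surrogate_pair(code_point):
    """Split a code point above U+FFFF into a UTF-16 surrogate pair.

    Hypothetical helper for illustration; not part of any Python API.
    """
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is outside the supplementary range")
    offset = code_point - 0x10000          # 20 significant bits remain
    high = 0xD800 + (offset >> 10)         # top 10 bits -> high surrogate
    low = 0xDC00 + (offset & 0x3FF)        # low 10 bits -> low surrogate
    return high, low


high, low = to_surrogate_pair(0x10400)
print(hex(high), hex(low))                 # 0xd801 0xdc00

# Cross-check against Python's own UTF-16 encoder.
encoded = "\U00010400".encode("utf-16-be")
assert encoded == bytes([high >> 8, high & 0xFF, low >> 8, low & 0xFF])
```

So under this reading, a single \U escape for such a character would indeed
yield two UTF-16 code units.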
> [Tim]
> > In which case there seems darned little point to it now <wink/frown>.
>
> [/F]
> > with Python's approach to escape codes, it's not exactly easy
> > to *add* a new escape code -- you risk breaking code that for
> > some reason (intentional or not) relies on u"\U12345678" to end
> > up as a backslash followed by 9 characters...
> >
> > not very likely, but I've seen stranger things...
>
> Ah! You're right, I'm wrong. +1 on \U12345678 now.
See
http://java.sun.com/docs/books/jls/second_edition/html/lexical.doc.html#100850
for how Java defines \uXXXX...
We're following an industry standard here ;-)
--
Marc-Andre Lemburg
______________________________________________________________________
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/