[Python-Dev] [patch #100912] should we keep the \xnnnn escape in unicode strings?

M.-A. Lemburg mal@lemburg.com
Sun, 16 Jul 2000 19:59:33 +0200

Fredrik Lundh wrote:
> > so let's change the question into a proposal:
> >
> >     for maximum compatibility with 8-bit strings and SRE,
> >     let's change "\x" to mean "binary byte" in unicode string
> >     literals too.
> I've prepared a small patch.  If nobody objects, I'll check
> it in next weekend, or so...
> http://sourceforge.net/patch/index.php?func=detailpatch&patch_id=100912&group_id=5470

There were objections from Finn Bock and myself: \xXXXX is
defined to mean "read all hex chars until the next non-hex char
and then cast to the underlying type (char or wchar_t)" in C9X.
I don't know about Java... Finn?

Not that this definition is optimal, but we should stick to what
the standard says and only deprecate usage of \xXXXX in favour
of \uXXXX. Code using escapes like "\xABCD" which then results
in "\xCD" is broken anyway -- having u"\xABCD" return u"\uABCD"
wouldn't make much of a difference (+ the bug would become pretty
obvious if viewed in a Unicode-aware viewer: Asian characters are
very easy to recognize in ASCII text ;-)
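
To make the two readings concrete, here is a minimal Python sketch of
both interpretations of the text following "\x". The function names
(greedy_hex_escape, fixed_hex_escape) are hypothetical and come from
neither the patch nor any standard. The "cast to the underlying type"
step is modelled as masking to the type's width, which is an
assumption about how the truncation behaves.

```python
import string

HEX_DIGITS = set(string.hexdigits)


def greedy_hex_escape(body, bits):
    """Read all leading hex digits of body, then truncate to `bits` bits.

    Models the "read until the next non-hex char, then cast" reading.
    Returns (value, remaining_text).
    """
    end = 0
    while end < len(body) and body[end] in HEX_DIGITS:
        end += 1
    if end == 0:
        raise ValueError("no hex digits after \\x")
    # Assumption: the cast simply keeps the low `bits` bits.
    value = int(body[:end], 16) & ((1 << bits) - 1)
    return value, body[end:]


def fixed_hex_escape(body, ndigits=2):
    """Read exactly `ndigits` hex digits (the "binary byte" proposal).

    Returns (value, remaining_text).
    """
    digits = body[:ndigits]
    if len(digits) != ndigits or not all(c in HEX_DIGITS for c in digits):
        raise ValueError("expected %d hex digits after \\x" % ndigits)
    return int(digits, 16), body[ndigits:]


# "\xABCD" under each reading (body is the text after "\x"):
print(greedy_hex_escape("ABCD", 8))    # 8-bit string: low byte 0xCD survives
print(greedy_hex_escape("ABCD", 16))   # 16-bit type: full 0xABCD survives
print(fixed_hex_escape("ABCD"))        # proposal: 0xAB, then literal "CD"
```

With an 8-bit target the greedy reading silently drops the high byte,
which is the "\xABCD" -> "\xCD" case above; with a 16-bit target the
same escape yields the single character 0xABCD.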

Would it be hard to make JPython support \uXXXX ? (Or does it
already ?)

Marc-Andre Lemburg
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/