[Python-Dev] should we keep the \xnnnn escape in unicode strings?
Sun, 16 Jul 2000 14:19:22 -0700
To me, this says punt the \x construct from Unicode objects altogether. If
it is broken, then why try to retain it?
I *do* find it useful in the regular string objects. For Unicode, I would
totally understand needing to use \u instead.
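A minimal sketch of the distinction, assuming a present-day Python 3 interpreter (which postdates this thread and may not match what was on the table here). The variable names are illustrative only. In a str literal, \x takes exactly two hex digits and \u exactly four, and both name a code point; in a bytes literal, \x names a single byte.

```python
# Assumes Python 3, where str is Unicode text and bytes is 8-bit data.
text_x = "\xe9"      # \x plus exactly two hex digits: code point U+00E9
text_u = "\u00e9"    # \u plus exactly four hex digits: the same code point
raw = b"\xe9"        # in a bytes literal, \x denotes a single byte

same_character = text_x == text_u
print(same_character)                   # True
print(len(text_x), ord(text_x))         # 1 233
print(raw[0])                           # 233
print(text_x.encode("latin-1") == raw)  # True
print(text_x.encode("utf-8"))           # b'\xc3\xa9'
```

The last two lines show why "a byte" and "a character" are different things: the same code point is one byte in Latin-1 but two bytes in UTF-8.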
On Sun, Jul 16, 2000 at 02:14:02PM -0400, Tim Peters wrote:
> > for maximum compatibility with 8-bit strings and SRE,
> > let's change "\x" to mean "binary byte" in unicode string
> > literals too.
> > Hmm, this is probably not in sync with C9X (see section 188.8.131.52),
> The behavior of \x in C9X is nearly incomprehensible -- screw it.
> > but then perhaps we should deprecate usage of \xXX in the context
> > of Unicode objects altogether. Our \uXXXX notation is far
> > superior to what C9X tries to squeeze into \x (IMHO at least).
> \x is a hack inherited from the last version of C, put in back when they
> knew they had to do *something* to support "big characters" but had no real
> idea what. C9X was not allowed to break anything in the std it built on, so
> they kept all the old implementation-defined \x behavior, and made it even
> more complicated so it would make some kind of sense with the new C9X character
> Python is stuck trying to make sense out of its ill-considered adoption of
> old-C's \x notation too. Letting it mean "a byte" regardless of context
> should make it useless enough that people will eventually learn to avoid it.
> Note that C9X also has \u and \U notations, and \u in C9X means what it does
> in Python, except that C9X explicitly punts on what happens for \u values in
> these (inclusive) ranges:
> \u0000 - \u0020
> \u007f - \u009f
> \ud800 - \udfff
> \U is used in C9X for 8-digit (hex) characters, deferring to ISO 10646.
> If C9X didn't *have* to keep \x around, I'm sure they would have tossed it.
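The three ranges quoted above can be written down as a small membership check. This is only a sketch built from the ranges as Tim lists them, not from the C9X text itself; the names C9X_PUNTED_RANGES and in_punted_range are hypothetical.

```python
# Inclusive ranges copied from the quoted message; names are hypothetical.
C9X_PUNTED_RANGES = (
    (0x0000, 0x0020),
    (0x007F, 0x009F),
    (0xD800, 0xDFFF),
)


def in_punted_range(code_point: int) -> bool:
    """Return True if code_point lies in one of the inclusive ranges above."""
    return any(low <= code_point <= high for low, high in C9X_PUNTED_RANGES)


print(in_punted_range(0x0020))  # True
print(in_punted_range(0x0041))  # False
print(in_punted_range(0xD800))  # True
print(in_punted_range(0xE000))  # False
```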
Greg Stein, http://www.lyra.org/