[Python-Dev] \u and \U escapes in raw unicode string literals

Guido van Rossum guido at python.org
Fri May 11 04:11:48 CEST 2007


On 5/10/07, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> Martin v. Löwis wrote:
> > why should you be able to get a non-ASCII character
> > into a raw Unicode string?
>
> The analogous question would be why can't you get a
> non-Unicode character into a raw Unicode string. That
> wouldn't make sense, since Unicode strings can't even
> hold non-Unicode characters (or at least they're not
> meant to).
>
> But it doesn't seem unreasonable to want to put
> Unicode characters into a raw Unicode string. After
> all, if it only contains ASCII characters there's
> no need for it to be a Unicode string in the first
> place.

This is what prompted my question, actually: in Py3k, in the
str/unicode unification branch, r"\u1234" changes meaning. Before the
unification it was an 8-bit string, in which \u was not special; now
it is a unicode string, in which \u *is* special.
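
For concreteness, a minimal sketch of the two behaviors, runnable
under Python 2.x (contemporary with this thread); the character
U+1234 is an arbitrary choice:

    # Python 2.x: \u is not special in a raw 8-bit string, but it
    # *is* special in a raw unicode (ur"") literal.
    s = r"\u1234"     # 8-bit str, six chars: \ u 1 2 3 4
    u = ur"\u1234"    # unicode, one char: U+1234
    assert len(s) == 6
    assert len(u) == 1
    # Under the unification branch as described above, r"\u1234"
    # becomes a unicode literal, so \u would be processed and
    # len(r"\u1234") would be 1 rather than 6.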

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

