On 5/10/07, Greg Ewing wrote:
> Martin v. Löwis wrote:
> > why should you be able to get a non-ASCII character into a raw Unicode string?
> The analogous question would be why you can't get a non-Unicode character into a raw Unicode string. That wouldn't make sense, since Unicode strings can't even hold non-Unicode characters (or at least they're not meant to).
>
> But it doesn't seem unreasonable to want to put Unicode characters into a raw Unicode string. After all, if it only contains ASCII characters, there's no need for it to be a Unicode string in the first place.
This is what prompted my question, actually: in Py3k, in the str/unicode unification branch, r"\u1234" changes meaning. Before the unification, this was an 8-bit string, where the \u was not special; now it is a unicode string, where \u *is* special.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)
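[Editorial note, added for illustration: the unification-branch behavior described above did not survive into released Python 3, which ultimately left \u non-special in raw string literals. A minimal sketch of the distinction, runnable on released Python 3:]

```python
# In released Python 3, a raw string leaves the \u escape uninterpreted:
raw = r"\u1234"     # six characters: backslash, 'u', '1', '2', '3', '4'
cooked = "\u1234"   # one character: the code point U+1234

print(len(raw))          # 6
print(len(cooked))       # 1
print(raw == "\\u1234")  # True: the backslash is preserved literally
```

Under the unification-branch semantics Guido describes, r"\u1234" would instead have produced the single character U+1234, which is exactly the change in meaning at issue in this thread.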