[Python-Dev] unicode/string asymmetries

Thomas Heller thomas.heller@ion-tof.com
Thu, 10 Jan 2002 22:21:27 +0100


From: "Martin v. Loewis" <martin@v.loewis.de>
> > >    unicode("some string", "unicode-escape")
> [...]
> > For example the copyright symbol "©" (repr("©") gives "\xa9").
> > Now I want to convert this string to unicode.
> > u"©" works fine, unicode(variable) gives an ASCII decoding error.
>
> As I said: unicode-escape is the precise encoding that is used to
> parse Unicode strings from source files. It interprets all bytes above
> 128 as Latin-1.
>
I must apologize, because at first it didn't seem to work:

>>> print unicode("\xa9", "unicode-escape")

Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeError: ASCII encoding error: ordinal not in range(128)
>>>

but then I found out that the conversion itself succeeds. It is only
print that fails, because it tries to encode the result as ASCII,
while the repr of it can be shown:

>>> unicode("\xa9", "unicode-escape")
u'\xa9'
>>>
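
For anyone reading this later, here is a rough sketch of the same
behaviour in Python 3 syntax. This is an assumption on my part about a
newer interpreter, not the 2.x session shown above: there the input is a
bytes object, the codec is spelled "unicode_escape", and ascii() plays
the role that repr() plays in the session. The variable names are only
illustrative.

```python
# Sketch for Python 3; the interactive session above is Python 2.
raw = b"\xa9"  # a single byte, the copyright sign in Latin-1

# Decoding with the default-style ASCII codec fails for bytes >= 128.
try:
    raw.decode("ascii")
    ascii_decode_ok = True
except UnicodeDecodeError:
    ascii_decode_ok = False

# unicode_escape treats non-escape bytes as Latin-1, so this succeeds.
text = raw.decode("unicode_escape")

# An escaped representation can always be shown in pure ASCII.
escaped = ascii(text)

# Encoding the character itself to ASCII fails, which is what print
# ran into when the output encoding was ASCII.
try:
    text.encode("ascii")
    ascii_encode_ok = True
except UnicodeEncodeError:
    ascii_encode_ok = False
```

So the decode step and the output step fail independently: the first
depends on the codec given to the conversion, the second on the encoding
of wherever the text is written.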

Thanks,

Thomas