char 128? no... 256
Tim Peters
tim.one at comcast.net
Wed Feb 12 08:49:54 EST 2003
[Afanasiy]
> Yes I know I asked for a Unicode object in the example code.
> I am trying to emulate what Python is doing. And as you can
> see, it cannot be decoded.
That's right: Unicode can be *en*coded into an 8-bit representation, but
not *de*coded. Going from an 8-bit string to Unicode is decoding; going
from Unicode to an 8-bit string is encoding. unicode() is a decoding
function, and is working correctly. If you want to encode, then, for
example,
>>> u'\u1234'.encode('utf-8')
'\xe1\x88\xb4'
>>>
>>> unicode('\xe1\x88\xb4', 'utf-8')
u'\u1234'
>>>
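For readers on Python 3, the same two directions can be sketched as below.
This is a translation of the session above, not part of the original
thread, and it assumes Python 3 naming: str is the Unicode type and bytes
is the 8-bit type. The names text and data are only illustrative.

```python
# Python 3 sketch: str is the Unicode type, bytes is the 8-bit type.
text = '\u1234'

# Encoding: Unicode -> 8-bit.
data = text.encode('utf-8')
print(repr(data))                    # b'\xe1\x88\xb4'

# Decoding: 8-bit -> Unicode.
print(data.decode('utf-8') == text)  # True

# Asking to decode something that is already Unicode is an error,
# which mirrors the complaint in the quoted message.
try:
    str(text, 'utf-8')
except TypeError as exc:
    print('TypeError:', exc)
```

The last step fails because the input is already Unicode, so there is
nothing left to decode.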
> So if I am getting a unicode object I have no way of converting it to
> ascii and am thus screwed.
There are dozens of ways to change it into an 8-bit string, depending on
which encoding scheme you choose to pass to the .encode() method.
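A minimal sketch of a few of those ways, again assuming Python 3 (str for
Unicode, bytes for 8-bit). The codec names and error-handler names are the
standard ones shipped with Python; the variable names are only
illustrative. Since the question was about ASCII specifically, the second
half shows what happens when the target encoding cannot represent the
character.

```python
text = '\u1234'

# The same character under several encoding schemes.
for codec in ('utf-8', 'utf-16-be', 'utf-32-be'):
    print(codec, repr(text.encode(codec)))

# ASCII has no representation for U+1234, so the default ("strict")
# error handling raises instead of guessing.
try:
    text.encode('ascii')
except UnicodeEncodeError as exc:
    print('UnicodeEncodeError:', exc.reason)

# Other error handlers trade fidelity for an ASCII-only result.
ascii_replace = text.encode('ascii', 'replace')            # b'?'
ascii_ignore = text.encode('ascii', 'ignore')              # b''
ascii_xml = text.encode('ascii', 'xmlcharrefreplace')      # b'&#4660;'
ascii_escape = text.encode('ascii', 'backslashreplace')    # b'\\u1234'
print(ascii_replace, ascii_ignore, ascii_xml, ascii_escape)
```

Only the UTF encodings round-trip here; 'replace' and 'ignore' lose the
original character, so which one is right depends on what the consumer of
the 8-bit string expects.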