Unicode perplex
Irmen de Jong
irmen at -nospam-remove-this-xs4all.nl
Mon Jun 21 17:08:45 EDT 2004
John Roth wrote:
> Remember that the trick
> is that it's still going to have the *same* stream of
> bytes (at least if the Unicode string is implemented
> in UTF-8.)
Which it isn't.
AFAIK Python's storage format for Unicode strings is
some form of 2-byte representation; it certainly isn't
UTF-8.
So if you want to turn your string into a Python Unicode
object, you really have to push it through the UTF-8 codec...
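
A minimal sketch of what that means, written in present-day Python 3
spelling (an assumption on my part; in the Python of this thread the
equivalent would be unicode(s, 'utf-8')). The sample bytes and the
variable names are made up for illustration:

```python
# Bytes as they might arrive from a file or socket, encoded as UTF-8.
# 'c', 'a', 'f' take one byte each; the e-acute takes two (0xC3 0xA9).
raw = b"caf\xc3\xa9"

# Decoding pushes the bytes through the UTF-8 codec and yields a
# Unicode string object. Its internal storage is not these bytes.
text = raw.decode("utf-8")

# The byte string counts bytes, the Unicode string counts characters.
byte_count = len(raw)
char_count = len(text)

# Encoding goes the other way and gives back the original byte stream.
roundtrip = text.encode("utf-8")
```

The byte stream and the Unicode object have different lengths here,
which is why you can't just relabel one as the other.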
--Irmen