UTF-8 to unicode or latin-1 (and yes, I read the FAQ)

Michael Ströder michael at stroeder.com
Thu Oct 19 08:46:26 EDT 2006


NoelByron at gmx.net wrote:
> 
> print 'K\xc3\xb6ni'.decode('utf-8')
> 
> and this line raised a UnicodeDecode exception.

Works for me.

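Note that 'K\xc3\xb6ni'.decode('utf-8') returns a Unicode object. When you
print it, Python implicitly encodes it to a byte string, and the character
set used for that depends on your console. If the console's encoding cannot
represent the character, the print fails even though the decode succeeded.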

This session shows the difference between the Unicode object and an encoded
byte string:

>>> u = 'K\xc3\xb6ni'.decode('utf-8')
>>> s = u.encode('iso-8859-1')
>>> u
u'K\xf6ni'
>>> s
'K\xf6ni'
>>>
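
To avoid relying on print's implicit conversion, you can encode explicitly
before writing to the console. What follows is only a minimal sketch: the
names raw, text, latin1 and encode_for_console are made up for illustration,
the fallback to ASCII with '?' replacement is an assumption about what you
want when the console encoding is unknown, and the b'' prefix needs
Python 2.6 or later.

```python
import sys

# UTF-8 encoded bytes for the same string used in the session above.
raw = b'K\xc3\xb6ni'

# Decoding turns the bytes into a Unicode object (4 characters).
text = raw.decode('utf-8')

# Encoding turns the Unicode object back into bytes in a chosen charset.
latin1 = text.encode('iso-8859-1')


def encode_for_console(text, encoding=None):
    """Hypothetical helper: encode text for output without raising.

    Uses the given encoding, else sys.stdout.encoding, else ASCII.
    Characters the target charset cannot represent become '?'.
    """
    if encoding is None:
        encoding = getattr(sys.stdout, 'encoding', None) or 'ascii'
    return text.encode(encoding, 'replace')


if __name__ == '__main__':
    # repr() keeps the output ASCII-only, so this is safe on any console.
    print(repr(latin1))
    print(repr(encode_for_console(text, 'ascii')))
```

Passing the encoding explicitly makes the result independent of the console
the script happens to run on.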

Ciao, Michael.


