UTF-8 to unicode or latin-1 (and yes, I read the FAQ)

Duncan Booth duncan.booth at invalid.invalid
Thu Oct 19 05:42:19 EDT 2006


NoelByron at gmx.net wrote:

> 'K\xc3\xb6ni'.decode('utf-8')     # 'K\xc3\xb6ni' should be 'König',
> contains a german 'umlaut'
> 
> but failed since python assumes every string to decode to be ASCII?

No. Because you pass 'utf-8' explicitly, Python decodes the string as UTF-8 
here, not as ASCII:

>>> 'K\xc3\xb6ni'.decode('utf-8').encode('latin1')
'K\xf6ni'

Your code must have failed somewhere else. Try posting the actual failing 
code and the full traceback.
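
As a rough illustration, here is a minimal sketch of the same round trip 
written for Python 3, where the byte string needs a b'' prefix (the session 
above is Python 2, so the prefix and the variable names are assumptions added 
for this example). It also shows one way an ASCII error can arise: asking for 
the 'ascii' codec on the same bytes. In Python 2 that codec is typically 
applied implicitly, for instance when .encode() is called on a byte string or 
when byte strings and unicode objects are mixed. Whether that is what 
happened in the original code is only a guess.

```python
# Python 3 sketch; the bytes literal stands in for the Python 2 str above.
raw = b'K\xc3\xb6ni'               # UTF-8 encoded bytes

text = raw.decode('utf-8')         # bytes -> text, using the codec we name
latin1 = text.encode('latin-1')    # text -> Latin-1 bytes

print(latin1)                      # b'K\xf6ni'

# The ASCII codec only fails because 0xc3 is outside the 0-127 range.
ascii_error = None
try:
    raw.decode('ascii')
except UnicodeDecodeError as exc:
    ascii_error = exc              # keep a reference; exc is cleared after the block
    print(type(exc).__name__, 'at byte offset', exc.start)
```

The traceback from a failure like the last one names the codec and the byte 
offset, which is why the full traceback is the most useful thing to post.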



