UTF-8 to unicode or latin-1 (and yes, I read the FAQ)

Fredrik Lundh fredrik at pythonware.com
Thu Oct 19 11:37:41 CEST 2006


NoelByron at gmx.net wrote:

> I'm struggling with the conversion of a UTF-8 string to latin-1. As far
> as I know the way to go is to decode the UTF-8 string to unicode and
> then encode it back again to latin-1?
> 
> So I tried:
> 
> 'K\xc3\xb6ni'.decode('utf-8')     # 'K\xc3\xb6ni' should be 'König',

"Köni", to be precise.

> contains a german 'umlaut'
> 
> but failed since python assumes every string to decode to be ASCII?

that should work, and it sure works for me:

 >>> s = 'K\xc3\xb6ni'.decode('utf-8')
 >>> s
u'K\xf6ni'
 >>> print s
Köni
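
for the second half of the round trip (unicode back to latin-1), here is a minimal sketch. it assumes an interpreter that accepts the b'' prefix (Python 2.6 or later, or Python 3); the variable names are only illustrative:

```python
# UTF-8 encoded input: the two bytes \xc3\xb6 together encode o-umlaut
utf8_bytes = b'K\xc3\xb6ni'

# step 1: decode the UTF-8 bytes into a unicode string (4 characters)
text = utf8_bytes.decode('utf-8')

# step 2: encode the unicode string as latin-1 (o-umlaut becomes one byte, \xf6)
latin1_bytes = text.encode('latin-1')

print(repr(latin1_bytes))
```

note that encode() is called on the decoded unicode object, not on the original byte string.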

what did you do, and how did it fail?

</F>



