decode unicode string using 'unicode_escape' codecs

aurora aurora00 at gmail.com
Fri Jan 13 20:57:43 EST 2006


Cool, it works! I have also done some due diligence to confirm that the
utf-8 encoding does not introduce any Python escapes accidentally. I have
written it up as a recipe in the Python Cookbook:

Efficient character escapes decoding
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/466293
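
The due-diligence claim above can be spot-checked with a brute-force scan.
This is only a sketch, written for Python 3 rather than the Python 2 used in
this thread, and the helper name utf8_introduces_backslash is made up for
illustration. It relies on the fact that UTF-8 encodes every non-ASCII
character with bytes in the range 0x80-0xFF, so the backslash byte 0x5C can
only come from a real backslash.

```python
# Sketch (Python 3): confirm that UTF-8 never emits a backslash byte (0x5C)
# for any character other than the backslash itself.
BACKSLASH = 0x5C


def utf8_introduces_backslash(ch):
    """Return True if a non-backslash character encodes to a byte 0x5C."""
    return ch != "\\" and BACKSLASH in ch.encode("utf-8")


# Scan every code point. Surrogates (U+D800..U+DFFF) cannot be encoded
# as UTF-8, so they are skipped.
offenders = [
    cp
    for cp in range(0x110000)
    if not 0xD800 <= cp <= 0xDFFF and utf8_introduces_backslash(chr(cp))
]

print(len(offenders))  # 0
```

An empty offenders list means the encode step cannot create an escape
sequence that was not already present in the unicode string.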

wy

> Does this do what you want?
>
>  >>> u'€\\n€'
> u'\x80\\n\x80'
>  >>> len(u'€\\n€')
> 4
>  >>> u'€\\n€'.encode('utf-8').decode('string_escape').decode('utf-8')
> u'\x80\n\x80'
>  >>> len(u'€\\n€'.encode('utf-8').decode('string_escape').decode('utf-8'))
> 3
>
> Basically, I convert the unicode string to bytes, unescape the bytes using
> the 'string_escape' codec, and then convert the bytes back into a
> unicode string.
>
> HTH,
>
> STeVe
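
For readers on Python 3, where the 'string_escape' codec no longer exists,
the same encode / unescape / decode round trip can be sketched as follows.
This is an assumption-laden translation, not the code from the thread:
decode_escapes is a hypothetical name, and codecs.escape_decode is an
undocumented CPython helper, so it may not be available on other
implementations.

```python
import codecs


def decode_escapes(text):
    """Expand backslash escapes in text while leaving non-ASCII characters intact.

    Hypothetical helper mirroring the round trip quoted above:
    str -> UTF-8 bytes -> unescape -> str.
    """
    raw = text.encode("utf-8")
    # codecs.escape_decode is undocumented in CPython; it returns a
    # (bytes, number_of_input_bytes_consumed) tuple.
    unescaped, _consumed = codecs.escape_decode(raw)
    return unescaped.decode("utf-8")


s = "€\\n€"                    # euro sign, backslash, 'n', euro sign
print(len(s))                  # 4
print(len(decode_escapes(s)))  # 3
```

One caveat with this approach: an escape such as \xe9 in the input expands to
a single raw byte that is not valid UTF-8 on its own, so the final decode
step raises UnicodeDecodeError for such input.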



