Need help on UNICODE conversion

Sun Sep 7 03:24:55 EDT 2003

Erik Max Francis <max at alcyone.com> writes:

> >>> u = unicode(codecs.BOM_UTF16_BE + u, 'utf-16')
> >>> u
> u'Kommentar Unicode *\xe4\xf6\xfc\xc4\xd6\xdc\xdf*\r\n\r\n'
> 
> ... which I can convert to Latin-1 and print to then see the umlauts and
> the double S.

It is better to use "utf-16-be" as the codec name in the first place,
rather than artificially prepending a BOM and letting the UTF-16 codec
determine the byte order from it.
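
A minimal sketch of the difference, written for Python 3 (bytes.decode)
rather than the Python 2 unicode() call quoted above. The sample bytes
in `raw` are an assumption, built to match the quoted output; the
original input data is not shown in this thread.

```python
import codecs

# Hypothetical sample data: big-endian UTF-16 bytes with no BOM,
# constructed here so that it decodes to the string quoted above.
text = 'Kommentar Unicode *\xe4\xf6\xfc\xc4\xd6\xdc\xdf*\r\n\r\n'
raw = text.encode('utf-16-be')

# Workaround from the quoted message: prepend a BOM so that the
# generic "utf-16" codec can work out the byte order from it.
via_bom = (codecs.BOM_UTF16_BE + raw).decode('utf-16')

# Suggested alternative: state the byte order in the codec name.
direct = raw.decode('utf-16-be')

# Both routes give the same string.
assert via_bom == direct

# All characters here fit in Latin-1, so this conversion succeeds.
latin1 = direct.encode('latin-1')
```

Note that "utf-16-be" does not strip a BOM: if the data already begins
with one, it comes through as a leading U+FEFF character.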

Regards,
Martin