file - codecs - unicode ???

Marcin 'Qrczak' Kowalczyk qrczak at knm.org.pl
Mon Mar 12 23:44:17 CET 2001


Mon, 12 Mar 2001 16:29:42 +0100, Sébastien Libert <sebastien.libert at comexis.com> wrote:

> >>> line
> '\377\3760\000.\0000\0000\000\011\000i\000n\000f\000o\000\011\000L\000o\000g
> \000_\000S\000t\000a\000n\000d\000a\000r\000d\000\011\000n\000g\000L\000o
> \000g\000\015\000\012'
> 
> What can i do with this kind of thing ?????

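It's encoded in UTF-16 (little-endian, with a byte order mark), so:
    unicode(line, 'UTF-16').encode('ASCII')
except that the '\012' at the end is bogus. It is only the first byte of
the two-byte UTF-16 newline ('\012\000'), probably because the file was
split into lines as raw bytes, so it has to be dropped before decoding.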
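
A minimal sketch of that fix, written for Python 3 (an assumption; there
unicode(line, 'UTF-16') is spelled line.decode('utf-16') on a bytes object).
The sample bytes are rebuilt from the quoted line, and decode_utf16_line is
a hypothetical helper name, not part of the codecs module. It simply drops a
dangling odd byte before decoding:

```python
# Rebuild the quoted line: BOM + little-endian UTF-16 text, with the last
# byte of the two-byte newline missing, as in the quoted session.
text = "0.00\tinfo\tLog_Standard\tngLog\r\n"
raw = b"\xff\xfe" + text.encode("utf-16-le")
line = raw[:-1]  # now ends with a lone 0x0a byte


def decode_utf16_line(data):
    """Decode UTF-16 bytes, ignoring a dangling odd byte at the end.

    Hypothetical helper, for illustration only.
    """
    if len(data) % 2:  # UTF-16 code units are two bytes each
        data = data[:-1]
    return data.decode("utf-16")  # the codec consumes the BOM


# Decoding the line as-is fails because of the odd trailing byte.
try:
    line.decode("utf-16")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

decoded = decode_utf16_line(line)
ascii_bytes = decoded.encode("ascii")
print(repr(decoded))
```

If you control how the file is read, decoding the whole file at once, or
opening it through the codecs module (for example codecs.open(filename, 'r',
'utf-16')), should avoid cutting a newline in half in the first place.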

You may want to use an encoding other than ASCII, because ASCII can
only represent unaccented Latin letters, digits, and basic punctuation.
If the string contains any other characters, you will get an exception.
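
To illustrate, here is a small sketch (Python 3 syntax, which is an
assumption; the sample string is just the first name from the quoted header)
showing the exception and a few ways around it:

```python
name = "S\u00e9bastien"  # contains an accented 'e', which ASCII lacks

# Strict ASCII encoding rejects the accented letter.
try:
    name.encode("ascii")
    failed = False
except UnicodeEncodeError:
    failed = True

as_latin1 = name.encode("latin-1")       # one byte per character
as_utf8 = name.encode("utf-8")           # the accented letter takes two bytes
lossy = name.encode("ascii", "replace")  # unencodable characters become '?'

print(failed, as_latin1, as_utf8, lossy)
```

Which encoding is appropriate depends on what will consume the output; the
"replace" variant loses information and is only useful for display.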

-- 
 __("<  Marcin Kowalczyk * qrczak at knm.org.pl http://qrczak.ids.net.pl/
 \__/
  ^^                      SUBSTITUTE SIGNATURE
QRCZAK


