Python UTF-8 and codecs
serge.orlov at gmail.com
Tue Jun 27 22:29:51 CEST 2006
On 6/27/06, Mike Currie <dev at null.com> wrote:
> I'm trying to write out files that have utf-8 characters 0x85 and 0x08 in
> them. Every configuration I try I get a UnicodeError: ascii codec can't
> decode byte 0x85 in position 255: ordinal not in range(128)
> I've tried using the codecs.open('foo.txt', 'rU', 'utf-8', errors='strict')
> and that doesn't work and I've also tried wrapping the file in a utf8_writer
> using codecs.lookup('utf8')
> Any clues?
Use unicode strings for non-ascii characters. The following program "works":
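import codecs
f = codecs.open('foo.txt', 'w', 'utf-8')  # 'U' only applies to read modes
f.write(unichr(0x85))  # a unicode string; the writer encodes it as UTF-8
f.close()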
But unichr(0x85) is a control character (U+0085, NEL, "next line"). Are you sure you want it?
What is the encoding of your data?
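To illustrate why the encoding question matters, here is a rough sketch of how the same byte 0x85 can mean different things. The cp1252 interpretation is only an assumption about where such data might come from, not something stated in the original question. The sketch uses io.open rather than codecs.open so that it runs on both Python 2.6+ and Python 3; the bytes written are the same either way. The helper name can_decode and the temporary file path are made up for the example.

```python
import io
import os
import tempfile

raw = b'\x85'  # the lone byte from the error message

# Read as Latin-1, byte 0x85 is the control character U+0085 (NEL).
as_latin1 = raw.decode('latin-1')

# Read as Windows-1252 (an assumption about the data), it is U+2026,
# the horizontal ellipsis.
as_cp1252 = raw.decode('cp1252')


def can_decode(data, encoding):
    """Hypothetical helper: report whether data decodes cleanly."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


# A lone 0x85 is neither valid ASCII nor valid UTF-8, so the bytes have
# to be decoded with their real encoding before being written as UTF-8.
print(can_decode(raw, 'ascii'), can_decode(raw, 'utf-8'))

# Writing the decoded text through a UTF-8 file yields two bytes for U+0085.
path = os.path.join(tempfile.mkdtemp(), 'foo.txt')
f = io.open(path, 'w', encoding='utf-8')
f.write(as_latin1)
f.close()

with open(path, 'rb') as fh:
    written = fh.read()
print(repr(written))
```

If the data really is cp1252, decoding with 'latin-1' would silently turn an ellipsis into a control character, which is why knowing the source encoding comes first.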