[Tutor] output not in ANSI, conversing char set to locale.getpreferredencoding()
Peter Otten
__peter__ at web.de
Tue Aug 14 16:03:46 CEST 2012
leon zaat wrote:
> I get the error:
> UnicodeDecodeError: 'ascii' codecs can't decode byte 0xc3 in position 7:
> ordinal not in range(128) for the openbareruimtenaam=u'' +
> (openbareruimtenaam1.encode(chartype)) line.
The error message means that database.select() returns a byte string.

    bytestring.encode(encoding)

implicitly attempts

    bytestring.decode("ascii").encode(encoding)

and will fail for non-ascii bytestrings no matter what encoding you pass to
the encode() method.
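
To see the failure in isolation, here is a minimal sketch that spells out the implicit decode step explicitly. The sample value is made up (the UTF-8 bytes of "Café"), not something from your database:

```python
# Hypothetical sample: the UTF-8 bytes for u"Caf\xe9".
bytestring = b"Caf\xc3\xa9"

error = None
try:
    # This is the step Python 2 performs behind the scenes before encode().
    bytestring.decode("ascii")
except UnicodeDecodeError as exc:
    error = exc
    print(exc)
```

The target encoding never comes into play; the decode with "ascii" fails first, at the 0xc3 byte.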
> I know that the default system codecs is ascii and chartype=b'cp1252'
> But how can i get the by pass the ascii encoding?
You have to find out the database encoding -- then you can change the
failing line to

    database_encoding = ... # you need to find out yourself, but many use
                            # UTF-8 -- IMO the only sensible choice these days
    file_encoding = "cp1252"
    openbareruimtenaam = openbareruimtenaam1.decode(
        database_encoding).encode(file_encoding)

As you now have a bytestring again you can forget about codecs.open() which
won't work anyway as the csv module doesn't support unicode properly in
Python 2.x (The csv documentation has the details).
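
If you need the decode-then-encode step in more than one place, it could be wrapped in a small helper. This is only a sketch: recode() is a name I made up, the sample value is invented, and "utf-8" as the database encoding is an assumption you have to verify yourself:

```python
def recode(value, source_encoding, target_encoding):
    """Decode a byte string from source_encoding and re-encode it."""
    return value.decode(source_encoding).encode(target_encoding)


database_encoding = "utf-8"  # assumption; check what your database really uses
file_encoding = "cp1252"

sample = b"Caf\xc3\xa9"  # hypothetical value as the database might return it
converted = recode(sample, database_encoding, file_encoding)
```

Note that a character cp1252 cannot represent would raise a UnicodeEncodeError in the encode() step, so you may want to decide up front how to handle such values.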
PS: the u"..." prefix is a way to write unicode constants in Python
source code; you cannot create a unicode variable by tucking it in front of a
string.

    u"" + bytestring

will trigger a decode

    u"" + bytestring.decode("ascii")

and is thus an obscure way to spell

    bytestring.decode("ascii")