AW: [PythonCE] UnicodeDecodeError with print

anne.wangnick at t-online.de
Mon Mar 14 19:06:50 CET 2005


Hello Michael,

This can happen with "normal" Python as well; try it by running python.exe
directly.
The issue is not that you create a unicode object but that you want to
print it. On the PC, when using IDLE, sys.stdout.encoding is set to
"cp1252". For file objects, "when Unicode strings are written to a file,
they will be converted to byte strings using this encoding".
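
As a rough illustration of that conversion, here is a minimal sketch in
Python 3 syntax (the thread itself is about Python 2). It assumes that
io.TextIOWrapper is a fair stand-in for a file object with an encoding;
the helper name bytes_written is made up for this example.

```python
import io


def bytes_written(text, encoding):
    """Write text to an in-memory stream and return the raw bytes produced."""
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding=encoding)
    stream.write(text)
    stream.flush()            # push the encoded bytes into the buffer
    data = raw.getvalue()
    stream.detach()           # keep the wrapper from closing the buffer
    return data


# The pound sign is U+00A3; with cp1252 it becomes the single byte a3.
pound_cp1252 = bytes_written("\u00a3", "cp1252")
print(repr(pound_cp1252))
```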

On the PDA, however, sys.stdout is a built-in module _pcceshell_support.
Apparently this module tries to convert unicode objects into strings using
the "ascii" encoding, which fails on the pound sign.
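
The failure itself is easy to reproduce without a PDA. The following is
only a sketch in Python 3 syntax, where the error for this direction is
named UnicodeEncodeError; the thread reports a UnicodeDecodeError under
Python 2 on PythonCE. The helper try_encode is hypothetical and has nothing
to do with _pcceshell_support.

```python
def try_encode(text, encoding):
    """Return the encoded bytes, or the name of the error that was raised."""
    try:
        return text.encode(encoding)
    except UnicodeError as exc:
        return type(exc).__name__


pound = "\u00a3"  # the pound sign

ascii_result = try_encode(pound, "ascii")    # ascii cannot represent U+00A3
cp1252_result = try_encode(pound, "cp1252")  # cp1252 maps it to one byte

print(ascii_result)
print(repr(cp1252_result))
```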

Thus, on the PDA, you have to take care of the conversion of unicode objects
to strings yourself before calling print. For instance:
	u = u'£'
	print u.encode("cp1252")

Now this doesn't fail, but it doesn't print a pound sign either. At
least it seems to do the same as if you had written:
	print '£'
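
If you do the conversion yourself, you can also decide what happens to
characters the target encoding cannot represent. Below is a minimal sketch
in Python 3 syntax; the function name encode_for_console and its defaults
are assumptions for illustration, not part of PythonCE.

```python
def encode_for_console(text, encoding="cp1252", errors="replace"):
    """Encode text explicitly, substituting characters that do not fit."""
    return text.encode(encoding, errors)


pound = "\u00a3"

as_cp1252 = encode_for_console(pound)                   # fits into cp1252
as_ascii = encode_for_console(pound, encoding="ascii")  # replaced by '?'
as_escaped = encode_for_console(pound, "ascii", "backslashreplace")

print(repr(as_cp1252), repr(as_ascii), repr(as_escaped))
```

With errors="replace" nothing is raised, at the price of losing the
character; "backslashreplace" keeps it readable as an escape sequence.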

[It gets more interesting when the string '£' is contained in a source file.
Try to put the following into a file using IDLE on the PC, and run it:
	# -*- coding: utf-8 -*-
	print "£".encode("hex")
You'll get c2a3. This is because IDLE itself sees the first line of your
file, and encodes the pound sign as c2a3 when storing the file.

Or try in a file:
	# -*- coding: utf-8 -*-
	print [hex(ord(c)) for c in u"£"]
You'll get ['0xa3']. IDLE stored the pound sign as c2a3 again, but Python
also uses the magic first line when building the unicode object, converting
c2a3 to the single unicode character U+00A3.

When you use Pocket Word or a similar editor to create a Python source file,
it will of course not understand your magic first line and will simply store
the pound sign as a3. Thus, all of your string literals are actually
cp1252-encoded.]
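
The byte values mentioned above can be checked directly. This is just a
sketch in Python 3 syntax (bytes.hex() does not exist in Python 2, where
.encode("hex") was used instead); the variable names are made up.

```python
pound = "\u00a3"  # one unicode character, code point 0xa3

utf8_hex = pound.encode("utf-8").hex()      # what IDLE stores with the coding line
cp1252_hex = pound.encode("cp1252").hex()   # what a cp1252 editor stores
code_points = [hex(ord(c)) for c in pound]

# Reading the cp1252 byte as if it were UTF-8 does not work: a3 on its own
# is not a valid UTF-8 sequence.
try:
    b"\xa3".decode("utf-8")
    mismatch = None
except UnicodeDecodeError:
    mismatch = "UnicodeDecodeError"

print(utf8_hex, cp1252_hex, code_points, mismatch)
```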

Regards,
Sebastian

-----Original Message-----
From: pythonce-bounces at python.org [mailto:pythonce-bounces at python.org] On
Behalf Of Michael Foord
Sent: Monday, 14 March 2005 13:04
To: pythonce at python.org
Subject: [PythonCE] UnicodeDecodeError with print


I am wondering if anyone knows the reason why

print u'£'

should cause a UnicodeDecodeError on pythonce? (The usual 'ascii codec
cannot decode character...' message.)

Obviously the '£' character is a non-ascii character. I am just
surprised that the print statement is using the ascii encoding at all
and not just passing the string to sys.stdout.

The particular reason I ask is that this doesn't happen with 'normal'
python... but I would like to know how the print statement decodes
unicode strings it prints. Since it *doesn't* raise an error normally it
obviously doesn't use defaultencoding - so why does the pythonce one?

Yours curiously,

Fuzzyman
http://www.voidspace.org.uk/python/index.shtml


