Displaying Unicode on the console (Windows)
Paul Moore
paul.moore at atosorigin.com
Mon Apr 14 11:44:05 EDT 2003
OK. I know this is a common question, and I know the answer is
basically "it's not as easy as you think" :-) But I'm confused as to
what I need to do to get a Unicode string to print on the console in
Windows.
To use a concrete example, I'd like to print the Euro symbol. A
Unicode string for this is u'\u20a0':
>>> import unicodedata
>>> unicodedata.name(u'\u20a0')
'EURO-CURRENCY SIGN'
Now, I know I can't just print directly:
>>> print u'\u20a0'
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeError: ASCII encoding error: ordinal not in range(128)
But what *do* I need to do? None of the obvious encodings work:
>>> print u'\u20a0'.encode("latin-15")
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
LookupError: unknown encoding: latin-15
>>> print u'\u20a0'.encode("latin-1")
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeError: Latin-1 encoding error: ordinal not in range(256)
>>> print u'\u20a0'.encode("iso8859_15")
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "C:\Python22\lib\encodings\iso8859_15.py", line 18, in encode
    return codecs.charmap_encode(input,errors,encoding_map)
UnicodeError: charmap encoding error: character maps to <undefined>
>>> print u'\u20a0'.encode("cp1258")
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "C:\Python22\lib\encodings\cp1258.py", line 18, in encode
    return codecs.charmap_encode(input,errors,encoding_map)
UnicodeError: charmap encoding error: character maps to <undefined>
>>> print u'\u20a0'.encode("mbcs")
?
I *know* I can display a Euro character on the console. Heck, just
hitting AltGr-4 or Alt-0128 on my keyboard, at the Python prompt,
displays as a Euro character. If I do
>>> print ord('€')
128
Which doesn't look much like 0x20a0, so there's something odd going
on...
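One possible explanation for the mismatch, offered as a sketch rather than a definitive diagnosis: U+20A0 is EURO-CURRENCY SIGN, which is a different character from EURO SIGN at U+20AC, and code page 1252 stores EURO SIGN as the single byte 0x80 (128). The snippet below checks both characters against cp1252. The helper name cp1252_byte is hypothetical, and the code assumes a newer Python than the 2.2 shown in the tracebacks (it uses bytearray).

```python
import unicodedata

EURO_SIGN = u'\u20ac'           # the character usually meant by "Euro symbol"
EURO_CURRENCY_SIGN = u'\u20a0'  # the character used in the examples above


def cp1252_byte(ch):
    """Return the cp1252 byte value for ch, or None if cp1252 cannot encode it.

    Hypothetical helper for illustration only.
    """
    try:
        encoded = ch.encode('cp1252')
    except UnicodeError:
        return None
    # bytearray indexing yields an int on both Python 2.6+ and Python 3.
    return bytearray(encoded)[0]


euro_name = unicodedata.name(EURO_SIGN)
euro_byte = cp1252_byte(EURO_SIGN)
currency_byte = cp1252_byte(EURO_CURRENCY_SIGN)

print(euro_name)      # name of U+20AC
print(euro_byte)      # byte value of U+20AC in cp1252
print(currency_byte)  # None when cp1252 has no mapping for the character
```

If that reading is right, the 128 reported by ord() is simply the cp1252 byte for U+20AC, and the encoding failures above come from asking for U+20A0, which these code pages do not contain.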
As I say, I *know* this is subtle. I'm happy to work out the details,
once I can get a simple example working. But how do I get started? How
do I get u'\u20a0' to display on my screen as a Euro character???
[BTW, in case it's relevant - I know it probably is - the output from
"chcp" at the console prompt is "Active code page: 1252"]
Thanks for any help,
Paul.
PS While I can live with having to know details of how the console is
configured in order to get this working, ideally I'd like to know a
way of getting anything I need out of Windows (even if it requires API
calls...), so that I can write something generic which, when given a
Unicode string, can display it on the console without needing any
extra information from the user...
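A minimal sketch of what such a generic routine might look like, under several assumptions: the helper names guess_console_encoding and encode_for_console are hypothetical, the ctypes module (not available in Python 2.2) is used to call the Win32 GetConsoleOutputCP function, and sys.stdout.encoding is consulted as a fallback where the interpreter provides it. Characters the console code page cannot represent are replaced with '?' instead of raising an error.

```python
import codecs
import sys


def guess_console_encoding(default='ascii'):
    """Best-effort guess at the codec the console expects (hypothetical helper)."""
    candidates = []
    if sys.platform == 'win32':
        try:
            import ctypes
            code_page = ctypes.windll.kernel32.GetConsoleOutputCP()
            if code_page:  # 0 means the call failed, e.g. no console attached
                candidates.append('cp%d' % code_page)
        except (ImportError, AttributeError, OSError):
            pass
    # Fall back to whatever the interpreter reports for stdout, if anything.
    candidates.append(getattr(sys.stdout, 'encoding', None))
    for name in candidates:
        if not name:
            continue
        try:
            codecs.lookup(name)  # make sure Python actually has this codec
            return name
        except LookupError:
            pass
    return default


def encode_for_console(text, encoding=None):
    """Encode text for the console, substituting '?' for unencodable characters."""
    if encoding is None:
        encoding = guess_console_encoding()
    return text.encode(encoding, 'replace')
```

With the code page reported by chcp above, encode_for_console(u'\u20ac', 'cp1252') gives the single byte 0x80, while u'\u20a0' falls back to '?'. Whether the console font then draws that byte as a Euro character is a separate question this sketch does not address.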