How to emit Cyrillic and Chinese via unicode from console mode?

rs387 rstarkov at gmail.com
Sun Sep 14 04:02:39 EDT 2008


On Sep 14, 2:03 am, "Siegfried Heintze" <siegfr... at heintze.com> wrote:
> Can someone point me to an example of a little program that emits non-ascii
> Unicode characters (Russian or Chinese perhaps)?

The following doesn't quite work, but I'll post it anyway since it
actually ends up printing the characters. Perhaps someone can point
out what causes the exception at the end?

The important thing is to set the console codepage to 65001, which is
UTF-8. This lets you output utf8-encoded text and see the Unicode
chars displayed.



import sys
import encodings.utf_8
import win32console

# Wrap stdout so that unicode strings are encoded as UTF-8 on the way out
sys.stdout = encodings.utf_8.StreamWriter(sys.stdout)

# Switch the console's input and output code pages to 65001 (UTF-8)
win32console.SetConsoleCP(65001)
win32console.SetConsoleOutputCP(65001)

s = u"English: ok\n"
s += u'Russian: \u0420\u0443\u0441\u0441\u043a\u0438\u0439\n'
s += u'Greek: \u03bc\u03b5\u03b3\u03b1\u03bb\u03cd\u03c4\u03b5\u03c1\u03b7\n'

print s
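
As an aside, if pywin32 isn't installed, I believe the same two console
calls can be reached through ctypes, since SetConsoleCP and
SetConsoleOutputCP live in kernel32. A rough sketch only: the helper name
set_console_utf8 is made up for illustration, and I've guarded it so that
it does nothing when not running on Windows.

```python
import ctypes
import sys

UTF8_CODEPAGE = 65001  # the Windows code page number for UTF-8


def set_console_utf8():
    """Try to switch the console to code page 65001.

    Hypothetical helper, not part of any library. Returns True only if
    both kernel32 calls report success, False otherwise (including when
    not running on Windows).
    """
    if sys.platform != "win32":
        return False  # no Windows console to configure
    kernel32 = ctypes.windll.kernel32
    ok_in = kernel32.SetConsoleCP(UTF8_CODEPAGE)
    ok_out = kernel32.SetConsoleOutputCP(UTF8_CODEPAGE)
    return bool(ok_in and ok_out)


changed = set_console_utf8()
```

Both calls return nonzero on success, so checking the result at least
tells you whether the switch took effect.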



If the output is redirected to a file, all is well: everything is
written properly in UTF-8. If run on the console, this also prints
everything correctly, but then throws a mysterious exception:

English: ok
Russian: Русский
Greek: μεγαλύτερη
Traceback (most recent call last):
  File "I:\Temp\utf8console.py", line 18, in <module>
    print s
  File "C:\Progs\Python25\lib\codecs.py", line 304, in write
    self.stream.write(data)
IOError: [Errno 0] Error
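
One way to narrow this down might be to take the console out of the
picture. The sketch below pushes one of the strings through the same kind
of UTF-8 StreamWriter into an in-memory buffer instead of sys.stdout. The
names buf, writer and encoded are mine, and io.BytesIO assumes Python 2.6
or later. If the bytes come out right, the encoding step is presumably
fine and the failure is somewhere in the write to the console itself.

```python
import codecs
import io

text = u'Russian: \u0420\u0443\u0441\u0441\u043a\u0438\u0439\n'

# In-memory stand-in for stdout, so no console is involved
buf = io.BytesIO()

# codecs.getwriter('utf-8') returns the UTF-8 StreamWriter class
writer = codecs.getwriter('utf-8')(buf)
writer.write(text)

# The raw bytes the wrapper would have handed to the underlying stream
encoded = buf.getvalue()
print(repr(encoded))
```

Each Cyrillic letter becomes two bytes in UTF-8, which is what the console
has to decode once the code page is 65001.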

Any ideas?

Roman

P.S. This really ought to Just Work in this day and age, and do so
without all those 65001/utf8 incantations. Pity that it doesn't. Sigh.


