How to emit UTF-8 from console mode?
M8R-yfto6h at mailinator.com
Thu Oct 2 04:46:37 CEST 2008
"Siegfried Heintze" <siegfried at heintze.com> wrote in message
news:vLCdnUSj27MaCX7VnZ2dnUVZ_uGdnZ2d at comcast.com...
>>Make sure you are using the Lucida Console font for the cmd.exe window and
>>type the commands:
>>python -c "print ''.join(unichr(i) for i in range(0x410,0x431))"
> Wowa! I was not aware of that chcp command! Thanks! How could I do that
> "chcp 1251" programmatically?
> The code was a little confusing because those two apostrophes look like a
> double quote!
> But what are we doing here? Can you convince me that we are emitting
> UTF-8? I need UTF-8 because I need to experiment with some OS function
> calls that give me UTF-16 and I need to emit UTF-16 or UTF-8.
> I think part of the problem is that Lucida Console is not as capable as
> "Arial Unicode MS" or the fonts used by urxvt-X.
In this case, it is not emitting UTF-8; it is emitting the windows-1251
encoding. As another poster mentioned, writing UTF-8 to the Windows console
fails with an error when the code page is 65001 (UTF-8). But you can write
the output to a file explicitly in UTF-8 or UTF-16 and view the file with
Notepad. I've used this method for processing Chinese text.
>>> import os,codecs
>>> data = u''.join(unichr(i) for i in range(0x410,0x431))
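For illustration only, here is a minimal sketch of the file approach described above, written in Python 3 syntax rather than the Python 2 of the transcript. The helper name write_text, the file names, and the temporary directory are assumptions made for the example and are not from the original post. It also compares the raw bytes, which is one way to check that windows-1251 output and UTF-8 output really differ.

```python
import codecs
import os
import tempfile

# Cyrillic letters U+0410..U+0430, the same range used in the post.
data = "".join(chr(i) for i in range(0x410, 0x431))

# The same text yields different bytes in each encoding.
cp1251_bytes = data.encode("cp1251")  # one byte per letter
utf8_bytes = data.encode("utf-8")     # two bytes per letter

def write_text(path, text, encoding):
    """Hypothetical helper: write text to path in the given encoding."""
    with open(path, "w", encoding=encoding, newline="") as f:
        f.write(text)

# File names and location are assumptions for this sketch.
out_dir = tempfile.mkdtemp()
utf8_path = os.path.join(out_dir, "cyrillic_utf8.txt")
utf16_path = os.path.join(out_dir, "cyrillic_utf16.txt")

# "utf-8-sig" prepends a byte order mark, which helps Notepad detect UTF-8.
write_text(utf8_path, data, "utf-8-sig")
# "utf-16" prepends a BOM and uses the platform's native byte order.
write_text(utf16_path, data, "utf-16")

print(len(cp1251_bytes), len(utf8_bytes))
```

Either file can then be opened in Notepad. Inspecting the files in binary mode shows the byte order mark followed by the encoded letters.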
One way to set the code page programmatically is to use ctypes, but this
will only work in a Windows console:
>>> import ctypes
>>> print u''.join(unichr(i) for i in
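As a hedged sketch of the ctypes idea (Python 3 syntax, not the code from the original post), the Win32 function SetConsoleOutputCP in kernel32 can be called to change the console output code page. The wrapper name set_console_output_cp is an assumption made for this example. On anything other than Windows it simply reports failure.

```python
import sys

def set_console_output_cp(codepage):
    """Hypothetical wrapper: try to set the console output code page.

    Returns True on success, False when not on Windows or when the
    call fails (for example, when no console is attached).
    """
    if sys.platform != "win32":
        return False
    import ctypes  # ctypes.windll exists only on Windows
    # SetConsoleOutputCP returns nonzero on success, zero on failure.
    return bool(ctypes.windll.kernel32.SetConsoleOutputCP(codepage))

changed = set_console_output_cp(1251)
print("code page changed:", changed)
```

This is roughly the programmatic counterpart of typing "chcp 1251", although chcp changes the input code page as well, which this sketch does not.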