[Tutor] Printing Chinese characters?
Alfred Milgrom
fredm at smartypantsco.com
Fri Oct 17 03:43:34 EDT 2003
Hi:
Let me start by saying I just love programming in Python. I love its
philosophy, its ease of use, the obvious productivity, the third-party
support, the code libraries, and so on.
(As an aside, I thought list comprehensions were pretty great, but now that
I have discovered iterators, it's even better :)
But just as important as the power of the language, the support given by
people on this forum is incredible. So I just want to start off by saying
thanks to all of you.
As far as my Chinese text string is concerned, the string is part of
comments in a Go game problem from a Chinese web site (Go is an ancient
oriental board game, also known as weiqi in China and baduk in Korea).
My first problem was converting the string into Unicode. Now that I have
access to the CJK encodings (thanks Danny), I believe that the encoding is
mainly 'chinese' rather than 'big5'. But there could be some special
Japanese Go terms in there as well :((
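For anyone following along, here is a rough sketch of how one might compare
candidate encodings on the same bytes. It assumes a current Python 3, where
the CJK codecs ship with the standard library and 'chinese' is an alias for
gb2312. The sample bytes and the helper name try_decodings are made up for
illustration; they are not the actual data from the web site.

```python
# Stand-in for the raw comment bytes: two characters encoded as GB2312.
# (Hypothetical sample, not the real string from the Go problem.)
raw = "\u9ed1\u5408".encode("gb2312")

def try_decodings(data, codec_names=("chinese", "big5")):
    """Return {codec name: decoded text, or None if decoding fails}."""
    results = {}
    for name in codec_names:
        try:
            results[name] = data.decode(name)
        except UnicodeDecodeError:
            results[name] = None
    return results

results = try_decodings(raw)
for name, text in results.items():
    # A wrong codec may still decode without error but yield unrelated
    # characters, so the output needs a human sanity check.
    print(name, ascii(text))
```

A decode that succeeds is not proof the codec is right, which is why the
result still has to make sense in the context of the Go problem.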
The first two Chinese characters might be:
u'\u9ed1' Black
u'\u5408' combine
(Translation made using unihan.txt - thanks Neal for pointing me in that
direction)
This makes sense in the context of the problem and would translate as
'Black to join his groups' (or similar).
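A lookup like this can be sketched in a few lines. The sample below only
assumes the tab-separated layout of unihan.txt (code point, field name,
value); the sample lines, the kOtherField name and the load_definitions
helper are invented for illustration and are not copied from the real file.

```python
# Made-up sample lines in the tab-separated Unihan layout:
#   U+XXXX <tab> field name <tab> value
SAMPLE_UNIHAN = """\
# comment lines start with '#'
U+5408\tkDefinition\tcombine
U+9ED1\tkDefinition\tblack
U+9ED1\tkOtherField\tignored by this sketch
"""

def load_definitions(text):
    """Map each character to its kDefinition value."""
    definitions = {}
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        code, field, value = line.split("\t", 2)
        if field == "kDefinition":
            # 'U+9ED1' -> chr(0x9ED1)
            definitions[chr(int(code[2:], 16))] = value
    return definitions

definitions = load_definitions(SAMPLE_UNIHAN)
glosses = [definitions.get(ch, "?") for ch in "\u9ed1\u5408"]
print(glosses)
```

With the real file, the same function could be fed the file contents read
as UTF-8, though the full file is large and a real tool would probably
cache the result.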
I haven't figured out what to do with the '?' characters yet: I can't tell
whether they are punctuation of some kind, an escape character, or something
else. And I don't know when to check for Japanese characters, either.
So I am not sure about the following characters yet. Some characters are
definitely not in the 'chinese' encoding.
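One possible way to narrow this down is to record exactly which byte
offsets fail to decode, so they can be inspected separately (for example
against a Japanese codec). This is only a sketch under assumptions: it uses
Python 3, the sample bytes contain a deliberately invalid byte, and the
find_undecodable helper is hypothetical.

```python
def find_undecodable(data, codec="gb2312"):
    """Decode data, returning (text, offsets of bytes that failed).

    Each failing byte is replaced by U+FFFD in the returned text.
    """
    pieces = []
    bad_offsets = []
    pos = 0
    while pos < len(data):
        try:
            pieces.append(data[pos:].decode(codec))
            break
        except UnicodeDecodeError as err:
            # err.start is relative to the slice we tried to decode.
            pieces.append(data[pos:pos + err.start].decode(codec))
            pieces.append("\ufffd")
            bad_offsets.append(pos + err.start)
            pos += err.start + 1  # skip one bad byte and carry on
    return "".join(pieces), bad_offsets

# Hypothetical sample: two valid GB2312 characters around one invalid byte.
sample = "\u9ed1".encode("gb2312") + b"\xff" + "\u5408".encode("gb2312")
text, bad_offsets = find_undecodable(sample)
print(ascii(text), bad_offsets)
```

If the '?' characters are already literal question marks in the source
bytes, this will not flag them; it only helps with bytes the chosen codec
rejects.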
But now I have enough ammunition to get me going forward.
(Terry: When I get enough confidence in my decoding, I will get in touch
with you concerning your unihan Python lookup module. Thanks.)
As a final aside, I know that many people prefer other editors to IDLE,
but IDLE can't be beaten in this situation. There is no need for
other GUIs, web browsers, etc. Because IDLE is written with Tkinter, it
automatically displays Unicode characters properly!
Thanks again,
Fred Milgrom