IDLE raw_input() and unicode
unendliche at hanmail.net
Thu Jun 5 05:00:54 CEST 2003
My setup is IDLE 0.8 on Python 2.2.2, WinXP. In case it's relevant,
I'm in Korea. (Hangeul is the Korean writing system.)
IDLE can print Hangeul just fine. (Note: my sitecustomize.py does
sys.setdefaultencoding("utf-8"). In case it's relevant.)
But IDLE fails to get Hangeul input with raw_input(). It balks
and prints out:
Traceback (most recent call last):
File "<pyshell#0>", line 1, in ?
TypeError: object.readline() returned non-string
After some study, I found that the builtin raw_input() seems to do
sys.stdin.readline() *AND* type-checking. (Whoa... since C code
is involved, there's no traceback and I'm not sure. Should I read
the C code in Python CVS? *normal user shudder*)
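A rough model of the check described above, written in plain Python rather
than taken from the C source. It is only a sketch under assumptions:
raw_input_model and FakeStdin are made-up names, and BYTE_STRING stands in
for the Python 2 str type so that the snippet also runs on newer Pythons.

```python
import sys

# On Python 2 the byte-string type is `str`; on Python 3 it is `bytes`.
BYTE_STRING = str if sys.version_info[0] == 2 else bytes


class FakeStdin(object):
    """Hypothetical stand-in for the object IDLE installs as sys.stdin."""

    def __init__(self, line):
        self.line = line

    def readline(self):
        return self.line


def raw_input_model(stdin):
    """Assumed behaviour: call readline(), then insist on a byte string."""
    line = stdin.readline()
    if not isinstance(line, BYTE_STRING):
        # Same message as in the traceback quoted in this post.
        raise TypeError("object.readline() returned non-string")
    if not line:
        raise EOFError("EOF when reading a line")
    # Strip the trailing newline, as raw_input() does.
    if line.endswith(b"\n"):
        line = line[:-1]
    return line
```

With this model, FakeStdin(b"abc\n") gives back the line without its
newline, while FakeStdin(u"\ud30c\uc774\uc36c\n") triggers the TypeError.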
sys.stdin is replaced with an instance of PyShell.PyShell by IDLE,
and the readline() method of the PyShell class does something I don't
understand:
# PyShell.py around line 475
line = self.text.get("iomark", "end-1c")
PyShell inherits from OutputWindow, which in turn inherits from
EditorWindow, and EditorWindow initializes self.text as a Tkinter.Text
widget. I don't know what the hell "self.tk.call(self._w, 'get',
index1, index2)", that is, the implementation of Tkinter.Text.get(),
does at all, but I assume it returns some sort of "non-string".
So I opened a DOS command line window, started python, and typed:
shell = PyShell.PyShell()
shell.reading = 1
# Typing some Hangeul. In this case, the name of Python itself.
It prints out: u'\ud30c\uc774\uc36c\n'. So the "non-string"
is actually unicode.
So... I suggest the following:
1) Make raw_input() able to return unicode. Why not? (But I suspect
there may be some deep reason.)
2) Or, at least, make PyShell.readline() return something other than a "non-string".
I think just changing "return line" to "return str(line)" would do.
(Make a change and try again.) Yes, it does.
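One caveat on suggestion 2: str(line) only works here because sitecustomize.py
sets the default encoding to utf-8. With the stock ascii default, calling
str() on a Hangeul unicode string would raise a UnicodeError. A minimal
sketch of a more explicit conversion follows; to_byte_string is a
hypothetical helper, not anything in IDLE, and the utf-8 default is an
assumption.

```python
import sys

# On Python 2 the byte-string type is `str`; on Python 3 it is `bytes`.
BYTE_STRING = str if sys.version_info[0] == 2 else bytes


def to_byte_string(line, encoding="utf-8"):
    """Return `line` as a byte string, encoding it explicitly if needed."""
    if isinstance(line, BYTE_STRING):
        return line  # already a byte string, pass it through untouched
    return line.encode(encoding)


# The unicode value reported in the session earlier in this post.
hangeul = u"\ud30c\uc774\uc36c\n"
encoded = to_byte_string(hangeul)
```

Naming the encoding in the call means the result does not depend on
whatever sys.setdefaultencoding() happens to have been set to.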
I googled for "IDLE raw_input unicode", and to my surprise, I found only
a few posts. A German user posted that raw_input() doesn't handle umlauts.
So it seems this is quite a general i18n problem.
Should I submit... *normal user shudder* a patch? It's just a one-line
change... Or can someone do all unicode-IDLE-users a favor and submit
I'm more than happy to hear something like "Yes, we know that, and it's
all fixed on IDLE version >0.8, and with Python 2.3 you will have no
-- Seo Sanghyeon