Can't print national characters in IDLE with Python 2.2.1c1

Wed Mar 20 01:33:05 EST 2002

Magnus Lyckå <magnus at thinkware.se> writes:

> Can someone explain this?

It's a known limitation, and it is not clear what the solution should
be.

If you type "funny characters" in IDLE, Tk will represent them as
Unicode strings (in fact, it represents *all* strings as Unicode
strings). When Tkinter returns strings to Python, it returns plain
(byte) strings if the string can be converted using the default
encoding (ASCII in most installations), and Unicode objects otherwise.
When Python then tries to interpret a Unicode string as source code,
it gives up: that text is not valid source code.
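
The conversion rule just described can be sketched roughly like this.
It is only an illustration, written for present-day Python 3 rather
than 2.2, so bytes stands in for the plain string and str for the
Unicode object. tk_result is a made-up helper, not a Tkinter function.

```python
def tk_result(text, default_encoding="ascii"):
    """Mimic the rule: byte string if the default encoding fits, else Unicode."""
    try:
        return text.encode(default_encoding)
    except UnicodeEncodeError:
        # The text does not fit the default encoding, so it stays Unicode.
        return text

plain = tk_result("print 1")             # pure ASCII input
funny = tk_result("s = '\xe5\xe4\xf6'")  # contains three non-ASCII letters
print(type(plain).__name__, type(funny).__name__)
```

Only the second value is the kind of object that the interpreter then
refuses to treat as source code.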

Note that, strictly speaking, your program is incorrect. According
to

http://www.python.org/doc/current/ref/lexical.html

# Python uses the 7-bit ASCII character set for program text and
# string literals. 8-bit characters may be used in string literals and
# comments but their interpretation is platform dependent; the proper
# way to insert 8-bit characters in string literals is by using octal
# or hexadecimal escape sequences.
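
For illustration, here is what the escape-sequence form looks like. The
sketch uses present-day Python 3 bytes literals, and it assumes Latin-1
as the intended interpretation of the bytes, which the reference manual
does not promise.

```python
# The same three bytes, written without any non-ASCII character in the source.
hex_form = b"\xe5\xe4\xf6"      # hexadecimal escapes
octal_form = b"\345\344\366"    # octal escapes for the same byte values

print(hex_form == octal_form)
print(list(hex_form))

# Assumption: the bytes are meant as Latin-1 text.
as_text = hex_form.decode("latin-1")
print(len(as_text))
```

The source text stays pure ASCII, so nothing depends on how the editor
or the platform encodes the file.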

There is no easy solution to this. Just consider the fragment

 >>> s = 'åäö'
 >>> print ord(s[0])

Which number do you want printed here? Assuming you have some answer
(say, 229), what would you expect if s contained Cyrillic or Japanese
characters instead?
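
To make the ambiguity concrete, here is a small sketch in present-day
Python 3, where text and bytes are separate types. The three encodings
are just examples I picked; the first byte of the "same" string differs
in each of them.

```python
s = "\u00e5\u00e4\u00f6"  # the three letters from the fragment above

# First byte of the string under three example encodings.
first_bytes = {}
for encoding in ("latin-1", "cp437", "utf-8"):
    first_bytes[encoding] = s.encode(encoding)[0]
    print(encoding, first_bytes[encoding])

# A Cyrillic letter has no single-byte answer in Latin-1 at all.
cyrillic = "\u0436"  # CYRILLIC SMALL LETTER ZHE
try:
    cyrillic.encode("latin-1")
    encodable = True
except UnicodeEncodeError:
    encodable = False
print(encodable)
```

229 is the right answer only if you assume Latin-1 (or the Unicode code
point); other encodings give other numbers, or none.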

Under PEP 263, some of the current restrictions will be removed, so
that you can put those characters into Unicode literals. Putting them
into string literals still won't be supported.
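
A rough sketch of what PEP 263 proposes: a comment at the top of the
file declares the source encoding, and Unicode literals are decoded
accordingly. The example runs on present-day Python 3, which implements
the PEP, and compiles the source from a bytes object so that the
declaration is what decides the interpretation. The choice of latin-1
is only an example.

```python
# Source file contents as raw bytes: a coding declaration on line 1,
# then a Unicode literal containing the bytes 0xE5 0xE4 0xF6.
source = b"# -*- coding: latin-1 -*-\ntext = u'\xe5\xe4\xf6'\n"

namespace = {}
exec(compile(source, "<pep263-example>", "exec"), namespace)

text = namespace["text"]
print(len(text))
print([ord(ch) for ch in text])
```

With the declaration in place each byte becomes one character, so the
question of what ord() should return has a defined answer.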

As a work-around, you can change the system default encoding.
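
One way to do that in Python 2, given here as an assumption since other
setups are possible, is a sitecustomize.py module that calls
sys.setdefaultencoding(). It has to happen there because site.py
removes that function from sys once startup is finished. The file
location and the choice of latin-1 are only examples. The function does
not exist in Python 3, so the sketch guards the call.

```python
# sitecustomize.py (example file, placed somewhere on sys.path)
import sys

# Assumption: Latin-1 is the encoding you want; pick the one you really use.
changed = False
if hasattr(sys, "setdefaultencoding"):   # only true during Python 2 startup
    sys.setdefaultencoding("latin-1")
    changed = True

print(sys.getdefaultencoding())
```

Be aware that this changes the implicit string/Unicode conversions for
every program run by that installation, not just IDLE.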

Regards,
Martin