[Python-Dev] IDLE and non-ASCII characters
Tim Peters
tim.one@home.com
Tue, 15 May 2001 02:28:34 -0400
[Guido]
> Postscript: using cut and paste, I *can* enter "s='äö'" in IDLE at the
> Python prompt, both on Linux and on Windows 98. It prints as
> '\xe4\xf6' on both systems. What changed?
[Martin]
> Perhaps the Tcl version? That sounds like the issue that Marc talked
> about: Tk behaves differently when text is entered programmatically
> (and perhaps through cut-n-paste), as compared to text entered through
> the keyboard. Using cut-n-paste with Tk 8.3.1, CVS python, X11R6.3 on
> Solaris 8 still gives me the UnicodeError.
I don't know which version of Python Guido used. I tried cut-&-paste of
s='äö'
from his email into the distributed 2.1 IDLE under Win98, and got
UnicodeError: ASCII encoding error: ordinal not in range(128)
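For readers on a current Python 3, the same class of failure can be reproduced without IDLE or Tk. This is only a sketch in modern Python, not the 2.1 code path that produced the message above; the helper name try_ascii is made up for illustration.

```python
def try_ascii(text):
    """Return the ASCII bytes for text, or a short error string if that fails."""
    try:
        return text.encode("ascii")
    except UnicodeEncodeError as exc:
        # exc.reason carries the codec's explanation of the failure
        return "UnicodeEncodeError: " + exc.reason


# Pure ASCII input encodes cleanly.
print(try_ascii("s='ab'"))

# \u00e4 and \u00f6 are the a-umlaut and o-umlaut from the snippet; both are
# above 127, so the ASCII codec rejects them.
print(try_ascii("s='\u00e4\u00f6'"))
```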
Tk appears to interfere with the usual Windows ALT+0nnn method of
entering funny characters, so I'm unsure what happens in that case -- for
me it either works fine or does something insane (moves the cursor to the
left margin, brings up an IDLE dialog box, etc).
If I open the system Character Map utility and copy-&-paste using *that*, I
can enter all sorts of stuff without problem:
>>> s = "àáâãäåæçèéêëìíîïðñòòóôõö÷øùúûüýþÿ"
>>> s
'\xe0\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef
\xf0\xf1\xf2\xf2\xf3\xf4\xf5\xf6\xf7\xf8\xf9\xfa\xfb\xfc\xfd\xfe\xff'
>>>
So not all clipboard entries are created equal.
Another clue: if I paste the s='äö' snippet from Guido's email into a file
opened with Notepad, then immediately copy it again from the Notepad doc,
then paste that into IDLE, again no problem:
>>> s='äö'
>>> s
'\xe4\xf6'
>>>
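The '\xe4\xf6' shown here is what one byte per character looks like in Latin-1 (Windows cp1252 agrees for these two characters). A small sketch in modern Python 3, where text and bytes are separate types, so the encoding step is explicit rather than implied as it was in 2.1:

```python
s = "\u00e4\u00f6"  # the two characters from s='äö'

# One byte per character, matching the escapes in the session above.
latin1_bytes = s.encode("latin-1")
print(latin1_bytes)  # b'\xe4\xf6'

# For contrast, UTF-8 needs two bytes for each of these characters.
utf8_bytes = s.encode("utf-8")
print(utf8_bytes)  # b'\xc3\xa4\xc3\xb6'
```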
Using a clipboard diagnostic tool I don't understand, when I copy from
Notepad these data formats are in the system clipboard:
TEXT
LOCALE
OEMTEXT
But when I copy from Guido's email under Outlook 2000, it's
DataObject
Rich Text Format
Rich Text Format Without Objects
RTF as Text
TEXT
UNICODETEXT
Ole Private Data
LOCALE
OEMTEXT
Under Character Map, it's
Rich Text Format
TEXT
LOCALE
OEMTEXT
So perhaps it's not the version of Tk but the source of the data, and that Tk
grabs an unfortunate data format (when present) from the clipboard in
preference to a fortunate one.
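To make that guess concrete, here is a sketch that compares the three format lists above and models a consumer that takes the first format it prefers. The preference order is purely an assumption for illustration, not Tk's documented behavior, and pick_format is a made-up helper; format names follow the lists above, with the Unicode text entry spelled UNICODETEXT.

```python
NOTEPAD = ["TEXT", "LOCALE", "OEMTEXT"]
OUTLOOK = [
    "DataObject", "Rich Text Format", "Rich Text Format Without Objects",
    "RTF as Text", "TEXT", "UNICODETEXT", "Ole Private Data",
    "LOCALE", "OEMTEXT",
]
CHARMAP = ["Rich Text Format", "TEXT", "LOCALE", "OEMTEXT"]

# Formats offered only by the source whose paste failed: the suspects.
suspects = set(OUTLOOK) - set(NOTEPAD) - set(CHARMAP)
print(sorted(suspects))


def pick_format(available, preference):
    """Return the first preferred format that the clipboard offers, else None."""
    for name in preference:
        if name in available:
            return name
    return None


# ASSUMPTION: a consumer that prefers Unicode text whenever it is present.
HYPOTHETICAL_PREFERENCE = ["UNICODETEXT", "TEXT", "OEMTEXT"]

for label, formats in [("Notepad", NOTEPAD), ("Outlook", OUTLOOK),
                       ("Character Map", CHARMAP)]:
    print(label, "->", pick_format(formats, HYPOTHETICAL_PREFERENCE))
```

Under that assumed ordering, only the Outlook clipboard yields something other than TEXT, which is consistent with the observation that only the Outlook paste fails.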
the-clipboard-is-a-complex-beast-ly y'rs - tim