Tkinter - non-ASCII characters in text widgets problem

Thu Jun 25 20:09:15 EDT 2009

Sebastian Pająk wrote:
>> Can, but should not.
>> I read that the problem is when using the Polish language only. Otherwise
>> things work normally. Is that correct?
> 
> Yes, correct
> 
>> If so then byte swap may be a problem.  Using the u'string' should solve
>> that. I am assuming you have the Polish alphabet working correctly on your
>> machine. I think I read that was so in an earlier posting.
>>
>> Are there any problems with his alphabet scrambling on your machine?
>> If so that needs investigating.  Here I assume you are reading Polish from
>> him on your machine and not a network translator version.
>>
> 
> The original thread is here:
> http://mail.python.org/pipermail/python-list/2009-June/717666.html
> I've explained the problem there
================
I re-read the posting. (Thanks for the link)

You do not mention if he has sent you any Polish words and if they
appear OK on your machine.

A note here: in reading the original posting I get symbols that I do not
recognize as letters of an alphabet.
From the line in your original:
      Label(root, text='ęóąśłżźćń').pack()
I see text='
            then an e with a goatee
                 a  capital O with an accent symbol on top (')
                 an a with a tail on the right
                 an s with an accent on top
                 an I-do-not-know-what, maybe some sort of l with a
                                            slash through the middle
                 a  couple of z's with accents on top
                 a  capital C with an accent on top
                 an n with a short bar on top

I put the code into python and took a look.

I get:
cat xx

# -*- coding: utf-8 -*-

import sys
from Tkinter import *

root = Tk()

Label(root, text='\u0119ó\u0105\u015b\u0142\u017c\u017a\u0107\u0144').pack()
Button(root,
text='\u0119ó\u0105\u015b\u0142\u017c\u017a\u0107\u0144').pack()
Entry(root).pack()

root.mainloop()

Then:
python xx
   File "xx", line 10
SyntaxError: Non-ASCII character '\xf3' in file xx on line 10, but no
encoding declared; see http://www.python.org/peps/pep-0263.html for details

So I did (read the PEP, that is).
It notes that Window$ adds a marker to the beginning of such files. Namely:
"To aid with platforms such as Windows, which add Unicode BOM marks
     to the beginning of Unicode files, the UTF-8 signature
     '\xef\xbb\xbf' will be interpreted as 'utf-8' encoding as well
     (even if no magic encoding comment is given).
"
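To make that signature concrete, here is a small sketch that checks whether a chunk of file data starts with the UTF-8 BOM. The helper name starts_with_utf8_bom and the two sample byte strings are made up for illustration; only codecs.BOM_UTF8 comes from the standard library.

```python
import codecs

def starts_with_utf8_bom(data):
    """Return True if the byte string begins with the UTF-8 signature."""
    return data.startswith(codecs.BOM_UTF8)

# Hypothetical file contents: one saved with a BOM, one with a coding comment.
with_bom = codecs.BOM_UTF8 + b"import sys\n"
without_bom = b"# -*- coding: utf-8 -*-\nimport sys\n"

print(starts_with_utf8_bom(with_bom))
print(starts_with_utf8_bom(without_bom))
```

Reading the first few bytes of the real file in binary mode and passing them to the helper would tell you whether an editor slipped a BOM in.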

Then I took out the o with the accent and re-ran the file.

Everything works except the text is exactly as shown above. That is:
\u0119ó\u0105\u015b\u0142\u017c\u017a\u0107\u0144
(shown twice as directed, once for the label, once for the button, no
apostrophes)
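That is the expected behaviour for a plain '...' string in Python 2: the \uXXXX escapes are only interpreted inside a u'...' literal, so without the u prefix the backslashes stay in the text as ordinary characters. A minimal sketch of the difference, assuming the file really contains backslash-u sequences as plain text (the variable names are mine, and the sample is shortened to three letters):

```python
# The file content as plain bytes: a backslash, a 'u' and four hex digits
# per letter, i.e. six characters each, not one.
literal = b'\\u0119\\u00f3\\u0105'

# The 'unicode_escape' codec interprets the escapes the same way a
# u'...' literal would.
decoded = literal.decode('unicode_escape')

print(len(literal))   # number of raw characters in the escaped text
print(len(decoded))   # number of actual letters after decoding
print(decoded == u'\u0119\u00f3\u0105')
```

Writing the labels as u'\u0119...' in the script should have the same effect as the decode step here.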

OK - now I take a look at what is actually in the file.
In MC on Linux Slackware 10.2 I read, in the mail folder,
0119 followed by a capital A with a tilde on top.
Hex readings beginning at the 0119\... are:
30 31 31 39 C3 B3 5C

but in the python file xx, I read:
30 31 31 39 5C
0119\...
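For what it is worth, the C3 B3 pair in the mail copy is exactly how UTF-8 encodes the accented o (U+00F3), while the '\xf3' in the SyntaxError is the single-byte Latin-1 form of the same letter. A small sketch to reproduce such hex readings; hex_bytes is a made-up helper name, not anything from the original script:

```python
# -*- coding: utf-8 -*-

def hex_bytes(text, encoding):
    """Encode text and return its bytes as space-separated hex pairs."""
    return ' '.join('%02X' % b for b in bytearray(text.encode(encoding)))

# "0119", then o with acute accent (U+00F3), then a backslash.
sample = u'0119\u00f3\\'

print(hex_bytes(sample, 'utf-8'))    # 30 31 31 39 C3 B3 5C
print(hex_bytes(sample, 'latin-1'))  # 30 31 31 39 F3 5C
```

Running the same helper over the first line of each copy of the file would show which encoding each one was saved in.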

I would have to say the mail system is screwing you up. You might try
zipping the file and sending it that way to see if the problem changes.

Steve