Tkinter - non-ASCII characters in text widgets problem

norseman norseman at hughes.net
Thu Jun 25 20:09:15 EDT 2009


Sebastian Pająk wrote:
>> Can, but should not.
>> I read that the problem is when using the Polish language only. Otherwise
>> things work normally. Is that correct?
> 
> Yes, correct
> 
>> If so then byte swap may be a problem.  Using the u'string' should solve
>> that. I am assuming you have the Polish alphabet working correctly on your
>> machine. I think I read that was so in an earlier posting.
>>
>> Are there any problems with his alphabet scrambling on your machine?
>> If so that needs investigating.  Here I assume you are reading Polish from
>> him on your machine and not a network translator version.
>>
> 
> The original thread is here:
> http://mail.python.org/pipermail/python-list/2009-June/717666.html
> I've explained the problem there
================
I re-read the posting. (Thanks for the link)

You do not mention if he has sent you any Polish words and if they
appear OK on your machine.

A note here: in reading the original posting I get symbols that are not
familiar to me as an alphabet.
From the line in your original:
      Label(root, text='ęóąśłżźćń').pack()
I see text='
            then an e with a goatee
                 a  capital O with an accent symbol on top (')
                 an a with a tail on the right
                 a  s with an accent on top
                 an I do not know what - maybe some sort of l with a
                                            slash through the middle
                 a  couple of z with accents on top
                 a  capital C with an accent on top
                 a  n with a short bar on top

I put the code into Python and took a look.



I get:
cat xx

# -*- coding: utf-8 -*-

import sys
from Tkinter import *

root = Tk()

Label(root, text='\u0119ó\u0105\u015b\u0142\u017c\u017a\u0107\u0144').pack()
Button(root,
text='\u0119ó\u0105\u015b\u0142\u017c\u017a\u0107\u0144').pack()
Entry(root).pack()

root.mainloop()

Then:
python xx
   File "xx", line 10
SyntaxError: Non-ASCII character '\xf3' in file xx on line 10, but no
encoding declared; see http://www.python.org/peps/pep-0263.html for details

So I did.
It notes that Windows puts extra bytes at the start of such files. Namely:
"To aid with platforms such as Windows, which add Unicode BOM marks
     to the beginning of Unicode files, the UTF-8 signature
     '\xef\xbb\xbf' will be interpreted as 'utf-8' encoding as well
     (even if no magic encoding comment is given).
"
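
For reference, here is a rough sketch of the check that PEP 263 describes:
a UTF-8 BOM at the very start of the file, or a "coding" comment on the
first or second line. The function name declared_encoding is made up for
this example, and the pattern is a simplified version of the one given in
the PEP, so treat it as an illustration and not as what the interpreter
actually runs.

```python
import codecs
import re

# Simplified form of the magic-comment pattern given in PEP 263.
CODING_RE = re.compile(br"^[ \t\f]*#.*?coding[:=][ \t]*([-\w.]+)")


def declared_encoding(source_bytes):
    """Return the encoding a source file declares, or None if there is none.

    Hypothetical helper for illustration only.
    """
    # A leading UTF-8 signature counts as a utf-8 declaration.
    if source_bytes.startswith(codecs.BOM_UTF8):
        return "utf-8"
    # The magic comment is only honoured on line 1 or line 2.
    for line in source_bytes.splitlines()[:2]:
        match = CODING_RE.match(line)
        if match:
            return match.group(1).decode("ascii")
    return None


print(declared_encoding(b"# -*- coding: utf-8 -*-\nimport sys\n"))
print(declared_encoding(b"import sys\n"))
```

Reading the file in binary mode and passing the bytes to this helper
would show whether the declaration survived the trip through the mail.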

Then I took out the o with the accent and re-ran the file.

Everything works except the text is exactly as shown above. That is:
\u0119ó\u0105\u015b\u0142\u017c\u017a\u0107\u0144
(shows twice as directed, one for label, one for button, no apostrophes)
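
One likely reason for the literal display: in Python 2 a plain '...'
string does not interpret \uXXXX escapes, only a u'...' string does. The
sketch below mimics that with a byte string holding the backslash
sequences as they sit in the file, then decodes them the way a unicode
literal would. The variable names are only for this example, and only the
first three escapes are used to keep it short.

```python
# What a Python 2 plain string '\u0119\u0105\u015b' holds: each escape
# stays as six separate bytes (backslash, u, four hex digits).
literal = b"\\u0119\\u0105\\u015b"

# What u'\u0119\u0105\u015b' holds: three characters.
decoded = literal.decode("unicode_escape")

print(len(literal))   # 18
print(len(decoded))   # 3
```

So changing text='...' to text=u'...' in the Label and Button calls
should make Tkinter receive the characters instead of the escapes.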

OK - now I take a look at what is actually in the file.
In MC on Linux Slackware 10.2 I read, in the mail folder,
0119 capital A with a tilde on top.
HEX readings beginning at the 0119\...
30 31 31 39 C3 B3 5C

but in the python file xx, I read:
30 31 31 39 5C
0119\...
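
A small sketch of how those mail-folder bytes decode, in case it helps
with reading the hex. The pair C3 B3 is the UTF-8 encoding of the o with
the accent; a viewer that assumes Latin-1 shows the same two bytes as a
capital A with a tilde plus a superscript 3. The variable names are only
for this example, and it does not show where in transit anything was
changed.

```python
# The seven bytes read from the mail folder: 30 31 31 39 C3 B3 5C
mail_bytes = b"\x30\x31\x31\x39\xc3\xb3\x5c"

as_utf8 = mail_bytes.decode("utf-8")      # '0119', o-acute, backslash
as_latin1 = mail_bytes.decode("latin-1")  # '0119', A-tilde, superscript 3, backslash

print(len(as_utf8))    # 6
print(len(as_latin1))  # 7

# The same letter as a single Latin-1 byte is F3, the byte named in the
# SyntaxError message.
print(u"\u00f3".encode("latin-1") == b"\xf3")   # True
```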

I would have to say the mail system is screwing you up.  Might try
zipping the file and sending it that way to see if the problem changes.


Steve



