Tkinter - non-ASCII characters in text widgets problem
spconv+m at gmail.com
Fri Jun 26 17:27:26 CEST 2009
Maybe this picture will tell you more:
The original script:
Maybe someone can confirm this OS X behaviour?
2009/6/26 Sebastian Pająk <spconv+m at gmail.com>:
> 2009/6/26 norseman <norseman at hughes.net>:
>> Sebastian Pająk wrote:
>>>> Can, but should not.
>>>> I read that the problem is when using the Polish language only. Otherwise
>>>> things work normally. Is that correct?
>>> Yes, correct
>>>> If so then byte swap may be a problem. Using the u'string' should solve
>>>> that. I am assuming you have the Polish alphabet working correctly on
>>>> your machine. I think I read that was so in an earlier posting.
>>>> Are there any problems with his alphabet scrambling on your machine?
>>>> If so that needs investigating. Here I assume you are reading Polish
>>>> from him on your machine and not a network translator version.
>>> The original thread is here:
>>> I've explained the problem there
>> I re-read the posting. (Thanks for the link)
>> You do not mention if he has sent you any Polish words and if they
>> appear OK on your machine.
> He has sent me Polish words, and they appear correct. We both have the
> English version of the system (both are set to the Polish locale: time,
> dates, keyboard, etc.)
>> A note here: In reading the original posting I get symbols that are not
>> familiar to me as alphabet.
>> From the line in your original:
>> Label(root, text='ęóąśłżźćń').pack()
>> I see text='
>> then an e with a goatee
>> a capital O with an accent symbol on top (')
>> an a with a tail on the right
>> an s with an accent on top
>> an I-do-not-know-what, maybe some sort of l with a
>> slash through the middle
>> a couple of z with accents on top
>> a capital C with an accent on top
>> an n with a short bar on top
>> I put the code into python and took a look.
>> I get:
>> cat xx
>> # -*- coding: utf-8 -*-
>> import sys
>> from Tkinter import *
>> root = Tk()
>> Label(root, text='\u0119ó\u0105\u015b\u0142\u017c\u017a\u0107\u0144').pack()
>> python xx
>> File "xx", line 10
>> SyntaxError: Non-ASCII character '\xf3' in file xx on line 10, but no
>> encoding declared; see http://www.python.org/peps/pep-0263.html for details
>> So I did.
>> It notes Window$ puts things into those lines. Namely:
>> "To aid with platforms such as Windows, which add Unicode BOM marks
>> to the beginning of Unicode files, the UTF-8 signature
>> '\xef\xbb\xbf' will be interpreted as 'utf-8' encoding as well
>> (even if no magic encoding comment is given)."
>> Then I took out the o with the accent and re-ran the file.
>> Everything works except the text is exactly as shown above. That is:
>> (shows twice as directed, one for label, one for button, no apostrophes)
>> OK - now I take a look at what is actually in the file.
>> in MC on Linux Slackware 10.2 I read, in the mail folder,
>> 0119 capitol A with a tilde on top.
>> HEX readings beginning at the 0119\...
>> 30 31 31 39 C3 B3 5C
>> but in the python file xx, I read:
>> 30 31 31 39 5C
>> I would have to say the mail system is screwing you up. Might try zipping
>> the file and sending it that way and see if problem changes.
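A side note on the hex readings above: the C3 B3 pair in the first dump is how UTF-8 stores the letter ó, and the '\xf3' in the SyntaxError is how a single-byte encoding such as Latin-1 stores the same letter. Below is a minimal sketch of that check. It assumes Python 3 (not the Python 2 used in this thread), and hex_dump and looks_like_utf8 are made-up helper names, not part of any library.

```python
# Sketch only, assuming Python 3. Helper names are hypothetical.

def hex_dump(data):
    """Format bytes as space-separated uppercase hex pairs."""
    return " ".join("%02X" % b for b in data)


def looks_like_utf8(data):
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


o_acute = "\u00f3"  # the letter 'ó'

utf8_bytes = o_acute.encode("utf-8")      # two bytes in UTF-8
latin1_bytes = o_acute.encode("latin-1")  # one byte in Latin-1

print(hex_dump(utf8_bytes))
print(hex_dump(latin1_bytes))
print(looks_like_utf8(utf8_bytes), looks_like_utf8(latin1_bytes))
```

If a script file contains a lone F3 byte where ó should be, it was most likely saved in a single-byte encoding, whatever its coding comment says.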
> I've tried zipping
> It looks like you didn't save the script in UTF-8. Try running the
> original script file from the attachment (UTF-8 without BOM).
> P.S. Do you have Mac OS X? It would be better if someone with a Mac tested it.
> # -*- coding: utf-8 -*-
> import sys
> from Tkinter import *
> root = Tk()
> root.tk.call('encoding', 'system', 'utf-8')
> Label(root, text=u'ęóąśłżźćń').pack()
> Button(root, text=u'ęóąśłżźćń').pack()
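To check which encoding Python will assume for a script file, one option is tokenize.detect_encoding from the standard library, which applies the PEP 263 rules (coding comment and UTF-8 BOM). This is a rough sketch, assuming Python 3, where the default without any declaration is UTF-8 (the Python 2 interpreter in this thread defaults to ASCII instead, hence the SyntaxError quoted above). declared_encoding is a hypothetical helper name, and the byte strings are made-up stand-ins for real files.

```python
# Sketch only, assuming Python 3. declared_encoding is a hypothetical helper.
import io
import tokenize


def declared_encoding(source_bytes):
    """Return the source encoding Python 3 detects for these bytes."""
    readline = io.BytesIO(source_bytes).readline
    encoding, _first_lines = tokenize.detect_encoding(readline)
    return encoding


# Stand-ins for the first lines of three differently saved scripts.
with_cookie = b"# -*- coding: utf-8 -*-\nimport sys\n"
with_bom = b"\xef\xbb\xbf" + b"import sys\n"   # UTF-8 signature, no comment
plain = b"import sys\n"                        # neither comment nor BOM

for label, data in [("cookie", with_cookie), ("bom", with_bom), ("plain", plain)]:
    print(label, declared_encoding(data))
```

Note that this only reports what the file declares. Whether the bytes were really saved in that encoding is a separate question.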