Looking for an appropriate encoding standard that supports all languages

Ata Jafari a.j.romanista at gmail.com
Mon Aug 23 09:34:26 EDT 2010


On Aug 20, 10:04 pm, Thomas Jollans <tho... at jollybox.de> wrote:
> On Thursday 19 August 2010, it occurred to ata.jaf to exclaim:
>
>
>
> > On Aug 17, 11:55 pm, Thomas Jollans <tho... at jollybox.de> wrote:
> > > On Tuesday 17 August 2010, it occurred to ata.jaf to exclaim:
> > > > I am developing a little program in Mac with wxPython.
> > > > But I have problems with the characters that are not in ASCII. Like
> > > > some special characters in French or Turkish.
> > > > So I am looking for a way to solve this. Like an encoding standard
> > > > that supports all languages. Or some other way.
>
> > > Anything that supports all of Unicode will do. Like UTF-8. If your text
> > > is mostly Latin, then just go for UTF-8; if you use other alphabets
> > > extensively, you might want to consider UTF-16, which might then use a
> > > little less space.
>
> > OK, I used UTF-8.
> > I write a line of strings in the source code and I want my program to
> > show that as an output on GUI. And this line of strings includes a
> > character like "ü". But I see that in the GUI this character is replaced
> > with other strange characters. I mean it doesn't work.
> > And when I try to use UTF-16, I get a syntax error that declares
> > "UTF-16 stream does not start with BOM".
>
> I get the feeling you're not actually using the encoding you say you're using,
> or not telling every program involved what you're doing.
>
> 1. Save the file in the correct encoding. Either tell your text editor to use
> a specific encoding (UTF-8 would be a good choice), or find out what encoding
> your text editor is using and use that encoding during the rest of the
> process.
>
> 2. Tell Python which encoding you're using. The coding: line will do the
> trick, *provided* you don't lie, and the encoding you specify in the file is
> actually the encoding you're using to store the file on disk.
>
> 3. Instruct your GUI library to do the right thing. If you use unicode strings
> (either by using Python 3 or by using the u"Käse" syntax in Python 2), that
> should be enough, otherwise, if you're using byte strings, which you shouldn't
> be doing in this case, you might have to tell the library what you're doing,
> or use the customary encoding. (For GTK+, this is UTF-8. For other libraries,
> it might be Latin-1, or system-dependent)
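
For anyone finding this thread in the archive, the three steps above can be
sketched roughly as follows. This is only a minimal sketch, assuming Python 3
(the u prefix is optional there but is what Python 2 needs); the sample
strings and variable names are made up for illustration and are not from my
program. It also shows what the "strange characters" look like when UTF-8
bytes are read as Latin-1, and that the plain "utf-16" codec writes a BOM.

```python
# -*- coding: utf-8 -*-
# Step 2: the coding line above must match how the file is really saved.
# (Python 3 assumes UTF-8 by default; Python 2 needs the line for non-ASCII.)
import codecs

# Step 3: use a text (unicode) string, not a byte string.
text = u"Käse für alle"          # illustrative sample, 13 characters

# Step 1: this is what an editor set to UTF-8 stores on disk.
utf8_bytes = text.encode("utf-8")    # 15 bytes: ä and ü take 2 bytes each

# Decoding with the wrong encoding produces the garbled output.
garbled = utf8_bytes.decode("latin-1")

# Decoding with the encoding that was actually used restores the text.
restored = utf8_bytes.decode("utf-8")

# The generic "utf-16" codec prepends a byte-order mark when encoding.
utf16_bytes = text.encode("utf-16")
starts_with_bom = utf16_bytes[:2] in (codecs.BOM_UTF16_LE,
                                      codecs.BOM_UTF16_BE)

# Size comparison for a non-Latin script (BOM left out for fairness).
cjk = u"日本語"                   # illustrative sample, 3 characters
cjk_utf8_size = len(cjk.encode("utf-8"))       # 3 bytes per character
cjk_utf16_size = len(cjk.encode("utf-16-le"))  # 2 bytes per character

print(garbled)
print(restored)
print(cjk_utf8_size, cjk_utf16_size)
```

If the garbled line resembles what the GUI shows, the bytes are most likely
fine and some step is decoding them with the wrong encoding.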

Finally I did it.
I was making some stupid mistakes.
Thanks a lot.
Ata


