[Tutor] i18n on Entry widgets

Jorge Louis de Castro jobauk at hotmail.com
Wed Aug 17 21:15:33 CEST 2005


Hi, thanks for the reply.


However, I get strange behavior when I try to feed text that must be Unicode 
to AltaVista for translation.
Just before sending, I get the following in the log, using

print "RECV DATA: ", repr(data)

and after entering "então" ("so" in Portuguese)

RECV DATA:  'right: ent\xc3\xa3o?'
Sent Message to Client Nr.  1
CONTENT:  ['right', ' ent\xc3\xa3o?']

Just above the CONTENT printout there is a data.split(":").

Now, right before sending the data to AltaVista for translation, I print 
content[1], which yields:

Translating:   então?

Which I find odd. Obviously, feeding this into Babel Fish results in a failed 
translation. So before sending, I try to encode it as you suggested.

try:
  print "Translating: ", content[1]
  decoded = content[1].encode('utf8')
  print "Decoding Prior to Translating: ", decoded
except Exception, e:
  print "EXCEPTION ENCODING ", e

try:
  translated = translate(decoded, src_l, dest_l)
except Exception, e:
  print "EXCEPTION TRANSLATING ", e
  translated = "translation failed"


The Exception thrown is:

EXCEPTION ENCODING  'ascii' codec can't decode byte 0xc3 in position 4: ordinal not in range(128)


I thought I was dealing with an ASCII string and asking for it to be encoded 
as UTF-8, yet Python is telling me it can't encode it as UTF-8?? That makes 
little sense to me.
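
A possible reading of that message, as a minimal sketch. It is written in 
Python 3 syntax (an assumption, since the code above is Python 2), and the 
byte string simply copies the repr() output from the log. In Python 2, 
calling .encode() on a byte string first decodes it with the default 'ascii' 
codec; the sketch spells out that hidden step, which would explain why an 
encode call reports a decode error.

```python
# Bytes copied from the CONTENT printout; they already look like UTF-8.
raw = b' ent\xc3\xa3o?'

# The implicit step Python 2 performs before encoding a byte string:
# decode with the default 'ascii' codec. Written out explicitly here.
try:
    raw.decode('ascii')
    error = None
except UnicodeDecodeError as exc:
    error = str(exc)

print(error)

# Decoding with the encoding the bytes are actually in succeeds.
text = raw.decode('utf-8')
print(ascii(text))

# Encoding the resulting text back to UTF-8 reproduces the original bytes.
assert text.encode('utf-8') == raw
```

If this reading is right, the data from the socket needs a decode('utf-8') 
on arrival rather than another encode before translation.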

Chrs
j.


>From: Kent Johnson <kent37 at tds.net>
>To: jorge at bcs.org.uk
>CC: tutor at python.org
>Subject: Re: [Tutor] i18n on Entry widgets
>Date: Wed, 17 Aug 2005 13:27:24 -0400
>
>Jorge Louis de Castro wrote:
>>Hi,
>>
>>How do I set the encoding of a string? I'm reading a string on an Entry 
>>widget and it may use accents and other special characters from languages 
>>other than English.
>>When I send the string read through a socket the socket is automatically 
>>closed. Is there a way to encode any special characters on a string?
>
>First you have to know what the encoding is of the string you get from the 
>Entry. IIRC a Tkinter widget will give you an ASCII string if possible, 
>otherwise a Unicode string. You could check this by
>  print repr(data)
>where data is the string you get from the Entry.
>
>Next you have to encode the unicode string to the encoding you want on the 
>socket. If you want utf-8, you would use
>  socket_data = data.encode('utf-8')
>This will work if data is ASCII or Unicode. There are many other supported 
>encodings; see http://docs.python.org/lib/standard-encodings.html for a 
>list.
>
>Kent
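
On the point about knowing the encoding first: the odd "Translating:" 
printout above is consistent with UTF-8 bytes being displayed as Latin-1. 
The sketch below, in Python 3 syntax, illustrates that. The Latin-1 display 
is an assumption; the thread does not say which encoding the console used.

```python
word = 'ent\xe3o'                     # "então" as a text (Unicode) string

# Encoding to UTF-8 gives the two-byte sequence seen in the log.
utf8_bytes = word.encode('utf-8')
print(repr(utf8_bytes))

# Interpreting those same bytes as Latin-1 maps each byte to one character,
# so the two bytes of the accented letter show up as two separate characters.
shown = utf8_bytes.decode('latin-1')
print(shown == 'ent\xc3\xa3o')        # True: this is the garbled form printed above

# Nothing is lost: re-encoding as Latin-1 and decoding as UTF-8 recovers the word.
recovered = shown.encode('latin-1').decode('utf-8')
print(recovered == word)
```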



