[Tutor] Unicode problems

Kent Johnson kent37 at tds.net
Tue Aug 29 23:09:34 CEST 2006


Ed Singleton wrote:
> I've been having unicode problems in python on Mac OS 10.4.
>
> I googled for it and found a good page in Dive Into Python that I
> thought might help
> (http://www.diveintopython.org/xml_processing/unicode.html).
>
> I tried following the instructions and set my default encoding using a
> sitecustomize.py, but got the following:
>
> >>> import sys
> >>> sys.getdefaultencoding()
> 'utf-8'
> >>> s = u'La Pe\xf1a'
> >>> print s
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> UnicodeEncodeError: 'ascii' codec can't encode character u'\xf1' in
> position 5: ordinal not in range(128)
>
> As I understand it, that should work.  I tried using different
> character sets (like latin-1, etc), but none of them work.
>   
I'm not sure Dive into Python is correct. Here is what I get on Windows:
In [1]: s = u'La Pe\xf1a'

In [2]: print s
La Peña

In [3]: import sys

In [4]: sys.getdefaultencoding()
Out[4]: 'ascii'

In [5]: sys.stdout.encoding
Out[5]: 'cp437'

I think print converts to the encoding of sys.stdout, not the default 
encoding. What is the value of sys.stdout.encoding on your machine?
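
Here is a minimal sketch of what I think is going on. try_encode is a
hypothetical helper of my own, not something print actually calls; it
only encodes the string with a named codec, which is roughly what print
appears to do with sys.stdout.encoding. It should run on Python 2 and on
Python 3.3 or later.

```python
import sys

def try_encode(text, encoding):
    """Hypothetical helper: return the encoded bytes, or None if the
    codec cannot represent some character in text."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        return None

s = u'La Pe\xf1a'

# The same string encoded with three codecs; None means the codec failed.
for name in ('ascii', 'cp437', 'utf-8'):
    print("%-6s %r" % (name, try_encode(s, name)))

# The two settings being compared in this thread.
print("default encoding: %r" % sys.getdefaultencoding())
print("stdout encoding:  %r" % getattr(sys.stdout, 'encoding', None))
```

If the last line reports an ASCII-like encoding on your machine, that
would fit the 'ascii' codec error you are seeing even though the default
encoding says 'utf-8'.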

Kent
> The main problem I am having is in getting python not to give an
> error when it encounters a sterling currency sign (£, pound sign here
> in UK), which I suspect might be some wider problem on the mac as when
> I type that character in the terminal it shows a # (but in Python it
> shows a £).

Where is the pound sign coming from? What encoding is it in? What do you 
mean when you say that in Python it shows £? You said Python gives an 
error. Fixing your first problem may not fix this one without a bit more 
digging. (BTW, in the US a # is sometimes called a 'pound sign'; maybe 
the computer is trying to translate for you ;) - though that is for 
pound weight, not pound sterling.)
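
The reason I ask about the encoding: the same pound sign is stored as
different bytes depending on the codec, so the source of the character
matters. A rough sketch, where bytes_for is a hypothetical helper and
the list of codecs is only my guess at the ones likely to be involved:

```python
POUND = u'\xa3'  # the pound sterling sign

def bytes_for(char, encodings):
    """Hypothetical helper: map each codec name to the bytes it produces
    for char, or to None if the codec cannot encode it."""
    result = {}
    for name in encodings:
        try:
            result[name] = char.encode(name)
        except UnicodeEncodeError:
            result[name] = None
    return result

table = bytes_for(POUND, ['ascii', 'latin-1', 'utf-8', 'mac_roman', 'cp437'])
for name in sorted(table):
    print("%-10s %r" % (name, table[name]))
```

ASCII has no pound sign at all, and UTF-8 uses two bytes where latin-1
uses one, so bytes written in one encoding and read in another will not
come out as £.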

Kent


