[Tutor] Unicode problems
Kent Johnson
kent37 at tds.net
Tue Aug 29 23:09:34 CEST 2006
Ed Singleton wrote:
> I've been having unicode problems in python on Mac OS 10.4.
>
> I googled for it and found a good page in Dive Into Python that I
> thought might help
> (http://www.diveintopython.org/xml_processing/unicode.html).
>
> I tried following the instructions and set my default encoding using a
> sitecustomize.py, but got the following:
>
>
> >>> import sys
> >>> sys.getdefaultencoding()
> 'utf-8'
> >>> s = u'La Pe\xf1a'
> >>> print s
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> UnicodeEncodeError: 'ascii' codec can't encode character u'\xf1' in
> position 5: ordinal not in range(128)
>
>
> As I understand it, that should work. I tried using different
> character sets (like latin-1, etc), but none of them work.
>
I'm not sure Dive Into Python is correct. Here is what I get on Windows:
In [1]: s = u'La Pe\xf1a'
In [2]: print s
La Peña
In [3]: import sys
In [4]: sys.getdefaultencoding()
Out[4]: 'ascii'
In [5]: sys.stdout.encoding
Out[5]: 'cp437'
I think print converts to the encoding of sys.stdout, not the default
encoding. What is the value of sys.stdout.encoding on your machine?
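
To illustrate the idea that the target stream's encoding matters more than the default encoding, here is a minimal sketch. The helper name encode_for_stream is hypothetical (it is not part of Python), and the fallback to ASCII when a stream reports no encoding is an assumption about what is probably happening on your machine, not something I have confirmed.

```python
import sys

def encode_for_stream(text, stream_encoding, errors="replace"):
    """Encode text for a stream that reports the given encoding.

    Hypothetical helper for illustration only. If the stream reports
    no encoding (None), fall back to ASCII, which is an assumption
    about how an unconfigured terminal behaves.
    """
    encoding = stream_encoding or "ascii"
    return text.encode(encoding, errors)

s = u'La Pe\xf1a'

# The two settings are independent of each other.
print("default encoding: %s" % sys.getdefaultencoding())
print("stdout encoding:  %s" % getattr(sys.stdout, "encoding", None))

# The same text encodes cleanly for cp437 but not for ASCII.
print(repr(encode_for_stream(s, "cp437")))
print(repr(encode_for_stream(s, "ascii")))
print(repr(encode_for_stream(s, None)))
```

With errors="replace" the ASCII case yields a '?' in place of the ñ instead of raising UnicodeEncodeError, which makes it easy to see which encoding is failing.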
Kent
> The main problem I am having is in getting python not to give an
> error when it encounters a sterling currency sign (£, pound sign here
> in UK), which I suspect might be some wider problem on the mac as when
> I type that character in the terminal it shows a # (but in Python it
> shows a £).
Where is the pound sign coming from? What encoding is it in? What do you
mean, in Python it shows £? You said Python gives an error... Fixing your
first problem may not help this one without a bit more digging. (BTW,
in the US a # is sometimes called a 'pound sign'; maybe the computer is
trying to translate for you ;) though it is for pound weight, not
pound sterling.)
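
The reason the encoding question matters is that the same pound sign is stored as different bytes depending on the encoding. A small sketch, assuming nothing about where your data actually comes from (the list of encodings is just a few likely candidates I picked, and the variable names are mine):

```python
pound = u'\xa3'  # POUND SIGN, U+00A3

# Candidate encodings chosen for illustration only.
candidates = ("latin-1", "utf-8", "mac-roman", "cp437")

# Map each encoding name to the bytes it produces for the pound sign.
encoded = dict((name, pound.encode(name)) for name in candidates)

for name in candidates:
    print("%-10s %r" % (name, encoded[name]))

# Decoding with the wrong codec silently gives the wrong characters:
# UTF-8 bytes read as latin-1 become two characters instead of one.
misread = encoded["utf-8"].decode("latin-1")
print(repr(misread))
```

If you can print the repr() of the string that causes the error, comparing its bytes with these should hint at which encoding the data is in.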
Kent