Unicode string output

Boudewijn Rempt boud at valdyas.org
Wed Jan 24 01:43:15 EST 2001


Fredrik Lundh wrote:

> Alexander Kostyrkin wrote:
>> Surprisingly, printing a Unicode string that contains a Japanese kanji
>> character raises an exception.
>> For example:
>>
>>     print u"\u55f4"
>> UnicodeError: ASCII encoding error: ordinal not in range(128)
>>
>> Is there any way to overcome the problem?
> 
> If you don't specify what encoding to use on output, Python assumes
> you're an encoding-ignorant American programmer <wink>, and defaults
> to ASCII.
> 
> To use any other encoding, use the encode method:
> 
>     s = ...
>     print s.encode("iso-8859-1")
>     print s.encode("ascii", "ignore")
> 
> Also see the codecs module:
> http://www.python.org/doc/current/lib/module-codecs.html
> 
> Cheers /F
> 
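
To make the quoted advice concrete, here is a small sketch of what the
encode method and its error-handler argument do with the kanji from the
original question. It assumes a current Python 3 interpreter (the thread
itself is about Python 2.0, where the literal would be written u"\u55f4"),
and the variable names are only illustrative.

```python
# Sketch for Python 3, where every str is a Unicode string.
s = "\u55f4"  # the kanji from the original question

# Encoding to ASCII with the default "strict" handler fails,
# which is the kind of error reported in the question.
try:
    s.encode("ascii")
    strict_failed = False
except UnicodeError:
    strict_failed = True

# An encoding that can represent the character works.
utf8_bytes = s.encode("utf-8")

# Error handlers trade fidelity for robustness.
ignored = s.encode("ascii", "ignore")    # silently drops the character
replaced = s.encode("ascii", "replace")  # substitutes "?"

print(strict_failed, utf8_bytes, ignored, replaced)
```

Note that "ignore" and "replace" lose information, so they are probably
only suitable for diagnostics, not for text that has to round-trip.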

I came across this problem recently, too, and I didn't want to have
to remember to explicitly encode my text everywhere in the application.
But Python doesn't really support setting a default encoding for a single
application, only for _all_ of Python on a machine, because site.py
deletes sys.setdefaultencoding from the sys module during startup.

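The solution was to add the following file, sitecustomize.py, to
site-packages:

    import sys

    # site.py deletes sys.setdefaultencoding after importing this module,
    # so keep a reference to it under another name.
    sys.setappdefaultencoding = sys.setdefaultencoding

And then call

    import sys
    sys.setappdefaultencoding("utf-8")

first thing in the application.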
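
For completeness, an alternative that leaves the interpreter-wide default
alone: wrap the output stream once with a writer from the codecs module
mentioned above, so that everything written to it is encoded on the way
out. This is a different technique from the sitecustomize.py hook, and only
a sketch; the in-memory io.BytesIO buffer stands in for a real output
stream, and the names buffer and out are purely illustrative.

```python
import codecs
import io

# In-memory byte stream standing in for a file or terminal (illustrative).
buffer = io.BytesIO()

# codecs.getwriter returns the StreamWriter class for the codec; calling it
# wraps the byte stream so that text written to it is encoded as UTF-8.
out = codecs.getwriter("utf-8")(buffer)

out.write(u"\u55f4\n")  # no explicit .encode() at the call site

encoded = buffer.getvalue()
print(repr(encoded))
```

Wrapping the stream once at startup would keep the choice of encoding
per application rather than per machine, at the cost of having to route
output through the wrapper.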


Boudewijn Rempt | http://www.xs4all.nl/~bsarempt/python


