string processing question

Sion Arrowsmith sion at viridian.paintbox
Fri May 1 12:46:34 CEST 2009


Kurt Mueller  <mu at problemlos.ch> wrote:
>:> python -c 'print unicode("ä", "utf8")'
>ä
>
>:> python -c 'print unicode("ä", "utf8")' | cat
>Traceback (most recent call last):
>  File "<string>", line 1, in <module>
>UnicodeEncodeError: 'ascii' codec can't encode characters in position
>0-1: ordinal not in range(128)

$ python -c 'import sys; print sys.stdout.encoding'
UTF-8
$ python -c 'import sys; print sys.stdout.encoding' | cat
None

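If print gets a Unicode string, it does an implicit
.encode(sys.stdout.encoding or sys.getdefaultencoding()) on it.
When stdout is a pipe, sys.stdout.encoding is None, so the fallback
is the default encoding (normally ASCII), which cannot represent ä.
If you want your output to be guaranteed UTF-8, you'll need to
explicitly .encode("utf8") it yourself.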
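
A rough sketch of that rule, in case it helps. implicit_encode is a
made-up helper (not anything in the standard library), and its
default_encoding argument stands in for sys.getdefaultencoding(),
which I'm assuming is "ascii" as on a stock 2.x install:

```python
def implicit_encode(text, stream_encoding, default_encoding="ascii"):
    # Hypothetical helper: imitates the implicit encode described above.
    # default_encoding is an assumption standing in for
    # sys.getdefaultencoding().
    return text.encode(stream_encoding or default_encoding)

text = u"\xe4"  # a-umlaut, U+00E4

# Terminal case: sys.stdout.encoding was reported as "UTF-8".
terminal_bytes = implicit_encode(text, "UTF-8")

# Pipe case: sys.stdout.encoding was None, so the ASCII fallback is used
# and the encode fails, as in the traceback quoted above.
try:
    implicit_encode(text, None)
    pipe_failed = False
except UnicodeEncodeError:
    pipe_failed = True

# Explicit encode: the same bytes regardless of where stdout points.
explicit_bytes = text.encode("utf8")
```

Doing the encode yourself takes sys.stdout.encoding out of the
picture, so the terminal and pipe cases behave the same.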

(I dare say this is slightly different in 3.x.)

-- 
\S

   under construction



