UTF-8 output problems
Laurent Pointal
laurent.pointal at wanadoo.fr
Sat Mar 10 09:32:31 EST 2007
Michael B. Trausch wrote:
> I am having a slight problem with UTF-8 output with Python. I have the
> following program:
>
> x = 0
>
> while x < 0x4000:
>     print u"This is Unicode code point %d (0x%x): %s" % (x, x,
>                                                          unichr(x))
>     x += 1
>
> This program works perfectly when run directly:
>
> mbt at pepper:~/tmp$ python test.py
> This is Unicode code point 0 (0x0):
> This is Unicode code point 1 (0x1):
> This is Unicode code point 2 (0x2):
> This is Unicode code point 3 (0x3):
> This is Unicode code point 4 (0x4):
> This is Unicode code point 5 (0x5):
> This is Unicode code point 6 (0x6):
> This is Unicode code point 7 (0x7):
> This is Unicode code point 8 (0x8):
> This is Unicode code point 9 (0x9):
> This is Unicode code point 10 (0xa):
> (... continued)
>
> However, when I attempt to redirect the output to a file:
>
> mbt at pepper:~/tmp$ python test.py >f
> Traceback (most recent call last):
>   File "test.py", line 6, in <module>
>     print u"This is Unicode code point %d (0x%x): %s" % (x, x,
>                                                          unichr(x))
> UnicodeEncodeError: 'ascii' codec can't encode character u'\x80' in
> position 39: ordinal not in range(128)
>
> This is slightly confusing to me. The output goes all the way to the
> end of the program when it is not redirected. Why is Python treating
> the situation differently when the output is redirected? This failure
> occurs for all redirection, by the way: >, >>, 1>2, pipes, and so forth.
>
> Any ideas?
In addition to Marc's reply: you can open a file with a specific encoding
(see the codecs.open() function) and use print >> f, ... to write to that file.
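
A minimal sketch of that approach, not a definitive version. The file name
codepoints.txt and the helper write_code_points() are made up for the
example. It calls f.write() instead of print >> f so that the same code
also runs on Python 3, where unichr() is spelled chr().

```python
import codecs
import os
import tempfile

try:
    unichr            # Python 2
except NameError:
    unichr = chr      # Python 3: chr() already returns Unicode text


def write_code_points(path, limit=0x4000):
    """Write one line per code point below `limit` to `path` as UTF-8."""
    # codecs.open() returns a file object that encodes Unicode strings
    # to the given encoding on write, so no ASCII default is involved.
    f = codecs.open(path, "w", encoding="utf-8")
    try:
        for x in range(limit):
            f.write(u"This is Unicode code point %d (0x%x): %s\n"
                    % (x, x, unichr(x)))
    finally:
        f.close()


# Hypothetical output location, chosen only for this example.
out_path = os.path.join(tempfile.gettempdir(), "codepoints.txt")
write_code_points(out_path)
```

With Python 2 you can keep the print statement inside the loop instead:
print >> f, u"..." % (x, x, unichr(x)).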
A+
Laurent.