the stupid encoding problem to stdout
Ben Finney
ben+python at benfinney.id.au
Wed Jun 8 22:39:44 EDT 2011
Sérgio Monteiro Basto <sergiomb at sapo.pt> writes:
> ./test.py
> moçambique
> moçambique
In this case your terminal is reporting its encoding to Python, and it's
capable of taking the UTF-8 data that you send to it in both cases.
> ./test.py > output.txt
> Traceback (most recent call last):
>   File "./test.py", line 5, in <module>
>     print u
> UnicodeEncodeError: 'ascii' codec can't encode character u'\xe7' in position 2: ordinal not in range(128)
In this case standard output is not a terminal (you're redirecting it
to a file), so nothing reports a preferred encoding to Python.
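
One way to see the difference for yourself is to ask Python what it
thinks of standard output. This is only a sketch; the report goes to
stderr so that it stays visible when stdout is redirected. Under
Python 2 the encoding is None when stdout is redirected to a file,
while Python 3 falls back to a locale-based default instead.

```python
import sys

# Encoding Python associates with stdout (may be None under Python 2
# when output is redirected to a file or pipe).
encoding = getattr(sys.stdout, "encoding", None)

# True only when stdout is attached to a terminal.
is_terminal = sys.stdout.isatty()

# Report on stderr, which is unaffected by "> output.txt".
sys.stderr.write("stdout encoding: %r, terminal: %r\n"
                 % (encoding, is_terminal))
```

Run it once directly and once with "> output.txt" to compare.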
In the first print statement you specify the encoding UTF-8, which is
capable of encoding the characters.
In the second print statement you haven't specified any encoding, so the
default ASCII encoding is used, and ASCII cannot represent 'ç'.
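
A minimal sketch of the two cases, assuming the text in test.py is
u'moçambique' (the script itself isn't shown in this thread, so the
names below are hypothetical):

```python
# The character in question, written as an escape so the source file
# itself stays pure ASCII.
text = u"mo\xe7ambique"

# Case 1: encoding named explicitly. UTF-8 can encode every character.
utf8_bytes = text.encode("utf-8")

# Case 2: the ASCII codec, which is what Python 2 falls back to when
# no encoding is known. It cannot encode u'\xe7'.
try:
    text.encode("ascii")
    ascii_failed = False
    failed_position = None
except UnicodeEncodeError as exc:
    ascii_failed = True
    failed_position = exc.start  # index of the offending character
```

The failing position matches the "position 2" in your traceback.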
Moral of the tale: Make sure an encoding is specified whenever data
steps between bytes and characters.
> It doesn't seem logical: when sending things to a file, the behaviour changes.
They're different files, which have been opened with different
encodings. If you want a different encoding, you need to specify that.
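
As a sketch of specifying the encoding yourself, you can wrap a byte
stream in a writer that encodes on the way out. Here an in-memory
io.BytesIO stands in for the redirected output (that stand-in is my
assumption, chosen so the example is self-contained):

```python
import codecs
import io

# Stand-in for a byte-oriented output file such as a redirected stdout.
raw = io.BytesIO()

# codecs.getwriter returns a StreamWriter class for the named codec;
# instantiating it wraps the byte stream.
writer = codecs.getwriter("utf-8")(raw)

# Text written through the wrapper is encoded as UTF-8 regardless of
# what the underlying stream would have guessed.
writer.write(u"mo\xe7ambique\n")

data = raw.getvalue()
```

In Python 2 the same wrapper can be applied to sys.stdout itself; in
Python 3 you would wrap sys.stdout.buffer, since sys.stdout there is
already a text stream.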
Ben Finney