print vs sys.stdout.write, and UnicodeError
Brent Lievers
3wbl at qlink.queensu.ca
Thu Oct 25 15:32:51 EDT 2007
Martin Marcher <martin at marcher.name> wrote:
> 25 Oct 2007 17:37:01 GMT, Brent Lievers <3wbl at qlink.queensu.ca>:
>> Greetings,
>>
>> I have observed the following (python 2.5.1):
>>
>> >>> import sys
>> >>> print sys.stdout.encoding
>> UTF-8
>> >>> print(u'\u00e9')
>> ?
>> >>> sys.stdout.write(u'\u00e9\n')
>> Traceback (most recent call last):
>> File "<stdin>", line 1, in <module>
>> UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in
>> position 0: ordinal not in range(128)
>
> >>> sys.stdout.write(u'\u00e9\n'.encode("UTF-8"))
> ?
>
>> Is this correct? My understanding is that print ultimately calls
>> sys.stdout.write anyway, so I'm confused as to why the Unicode error
>> occurs in the second case. Can someone explain?
>
> you forgot to encode what you are going to "print" :)

Thanks. I obviously have a lot to learn about both Python and Unicode ;-)

So does print do this encoding step based on the value of
sys.stdout.encoding? In other words, something like:

    sys.stdout.write(textstr.encode(sys.stdout.encoding))

I'm just trying to understand why encode() is needed in the one case but
not the other.

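
To make that guess concrete, here is a minimal sketch of "encode first, then
write". It is only an illustration of the idea in the question, not a claim
about what print does internally. The helper name write_encoded and its
fallback parameter are hypothetical, and the sketch targets a bytes buffer
on a current Python 3 interpreter rather than the 2.5.1 session quoted above.

```python
import io


def write_encoded(stream, text, encoding=None, fallback="ascii"):
    """Encode text explicitly, then write the resulting bytes to a binary stream.

    Hypothetical helper: picks the explicit encoding if given, otherwise the
    stream's own `encoding` attribute if it has one, otherwise `fallback`.
    """
    enc = encoding or getattr(stream, "encoding", None) or fallback
    data = text.encode(enc)  # raises UnicodeEncodeError if enc cannot represent text
    stream.write(data)
    return data


# With an explicit UTF-8 encoding, the e-acute character is written as two bytes.
buf = io.BytesIO()
write_encoded(buf, u"\u00e9\n", encoding="UTF-8")
written = buf.getvalue()

# With no encoding available, the helper falls back to ASCII and the same
# text cannot be encoded, which mirrors the error message in the traceback.
try:
    write_encoded(io.BytesIO(), u"\u00e9\n")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True
```

If the guess is right, the difference between the two cases would come down
to which codec gets chosen before the bytes reach the stream.
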
Cheers,
Brent