Unicode, stdout, and stderr
Peter Otten
__peter__ at web.de
Tue Jul 22 05:09:37 EDT 2014
Frank Millman wrote:
>
> "Peter Otten" <__peter__ at web.de> wrote in message
> news:lql3am$2q7$1 at ger.gmane.org...
>> Frank Millman wrote:
>>
>>> Hi all
>>>
>>> This is not important, but I would appreciate it if someone could
>>> explain the following, run from cmd.exe on Windows Server 2003 -
>>>
>>> C:\>python
>>> Python 3.4.1 (v3.4.1:c0e311e010fc, May 18 2014, 10:38:22) [MSC v.1600 32 bit (Intel)] on win32
>>> Type "help", "copyright", "credits" or "license" for more information.
>>>>>> x = '\u2119'
>>>>>> x # this uses stderr
>>> '\u2119'
>>
>> No, both print to stdout, but just
>>
>>>>> x
>>
>> is passed to the display hook of the interactive interpreter. This applies
>> repr() and then tries to print the result. If this fails it makes
>> another effort, roughly (the actual code is written in C)
>>
>> sys.stdout.buffer.write(repr(x).encode(
>>     sys.stdout.encoding, "backslashreplace"))
>>
>>
>
> Thanks, Peter. Very interesting.
>
> Out of interest, does the same thing happen when writing to sys.stderr?
If you are asking about the fallback mechanism, that is specific to
sys.displayhook in the interactive interpreter.
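To make the fallback concrete, here is a rough sketch in Python. The function name display() and the in-memory "console" are my own stand-ins for illustration; the real hook is implemented in C and works on sys.stdout, so treat this as an approximation rather than the actual code.

```python
import io

def display(value, stream):
    """Approximate the interactive display hook, including its fallback."""
    if value is None:
        return
    text = repr(value)
    try:
        stream.write(text)
    except UnicodeEncodeError:
        # Fallback: escape whatever the stream's encoding cannot represent
        # and hand the resulting bytes to the underlying binary buffer.
        data = text.encode(stream.encoding, "backslashreplace")
        stream.flush()  # keep ordering if text was already pending
        stream.buffer.write(data)
    stream.write("\n")

# Hypothetical stand-in for a cp437 console that uses the strict handler.
raw = io.BytesIO()
console = io.TextIOWrapper(raw, encoding="cp437", errors="strict", newline="\n")
display("\u2119", console)
console.flush()
shown = raw.getvalue()
```

Here shown holds the escaped repr instead of the call failing, which matches what Frank saw at the prompt.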
But stdout and stderr do handle errors differently:
>>> import sys
>>> sys.stdout.errors
'strict'
>>> sys.stderr.errors
'backslashreplace'
So a code point written to stdout that cannot be encoded with
stdout.encoding raises a UnicodeEncodeError, while a code point written to
stderr that cannot be encoded with stderr.encoding is written as a
backslash escape.
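You can see the difference without touching the real streams. The sketch below is only an illustration: the helper encode_through() is a name I made up, and cp437 is assumed as the console encoding. It pushes the same character through two in-memory text streams that differ only in their error handler.

```python
import io

def encode_through(text, errors, encoding="cp437"):
    """Write text through a TextIOWrapper and return the bytes produced."""
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding=encoding, errors=errors,
                              newline="\n")
    stream.write(text)
    stream.flush()
    return raw.getvalue()

# stderr-like behaviour: the unencodable code point is escaped.
escaped = encode_through("\u2119", "backslashreplace")

# stdout-like behaviour: the same write raises UnicodeEncodeError.
try:
    encode_through("\u2119", "strict")
except UnicodeEncodeError as exc:
    strict_failure = exc
else:
    strict_failure = None
```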
You can also make stdout itself more forgiving by reopening it with a
different error handler:
>>> import sys
>>> print("\u2119")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.4/encodings/cp437.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_map)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u2119' in position 0: character maps to <undefined>
>>> sys.stdout = open(1, mode="w", errors="xmlcharrefreplace",
...                   encoding=sys.stdout.encoding, closefd=False)
>>> print("\u2119")
&#8473;
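On newer Pythons there is a shorter route. The sketch below assumes Python 3.7 or later, where text streams have a reconfigure() method; that is not available in the 3.4 session shown here. I use an in-memory stream with an assumed cp437 encoding instead of the real sys.stdout so the effect can be checked; on a real console the equivalent call would be sys.stdout.reconfigure(errors="xmlcharrefreplace").

```python
import io

# Hypothetical stand-in for a cp437 console; requires Python 3.7+.
raw = io.BytesIO()
stream = io.TextIOWrapper(raw, encoding="cp437", errors="strict",
                          newline="\n")

# Switch the error handler in place instead of reopening the stream.
stream.reconfigure(errors="xmlcharrefreplace")

print("\u2119", file=stream)
stream.flush()
result = raw.getvalue()
```

The character comes out as a numeric character reference, the same as with the reopened stdout.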