[Python-ideas] Fix default encodings on Windows

eryk sun eryksun at gmail.com
Fri Aug 12 08:31:19 EDT 2016


On Thu, Aug 11, 2016 at 6:41 PM, Adam Bartoš <drekin at gmail.com> wrote:
> The transcoding wrappers with 'utf-8' encoding are used just as a workaround
> for the fact that the Python tokenizer cannot use utf-16-le and that the
> readlinehook machinery is unfortunately bytes-based. The transcoding wrapper
> just has encoding 'utf-8' and no buffer attribute, so there is no actual
> transcoding in sys.std* objects. It's just a signal for PyOS_Readline
> consumers, and the transcoding occurs in a custom readline hook. Nothing
> like this would be needed if PyOS_Readline was replaced by some Python API
> wrapper around sys.readlinehook that would be Unicode string based.

If win_unicode_console gets added to the standard library, I think it
should provide at least a std*.buffer interface that transcodes
between UTF-16 and UTF-8 (with errors='replace'), to make this as much
of a drop-in replacement as possible. I know it's not required. For
example, IDLE doesn't implement this. But I'm also sure there's code
out there that uses stdout.buffer, including in the standard library.
It's mostly test code (not including cases for piping output from a
child process) and simple script interfaces, but if we don't have to
break people's code, we really shouldn't.
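
To make the std*.buffer idea concrete, here is a rough sketch of the kind
of transcoding layer I have in mind. It is not win_unicode_console's actual
code: the class names Utf8WriteBuffer and Utf8ReadBuffer are made up for
illustration, and the underlying "raw" object is assumed to be any
bytes-based stream that carries UTF-16-LE (standing in for the console).

```python
import codecs
import io


class Utf8WriteBuffer(io.RawIOBase):
    """Hypothetical stdout.buffer stand-in: accepts UTF-8 bytes and
    forwards them as UTF-16-LE bytes to an underlying raw stream."""

    def __init__(self, raw_utf16):
        self._raw = raw_utf16
        # Incremental decoder, so a multi-byte UTF-8 sequence split
        # across two write() calls is still decoded correctly.
        self._decoder = codecs.getincrementaldecoder("utf-8")(errors="replace")

    def writable(self):
        return True

    def write(self, data):
        text = self._decoder.decode(bytes(data))
        if text:
            self._raw.write(text.encode("utf-16-le"))
        # Report all input as consumed, even if part of it is still
        # buffered inside the decoder.
        return len(data)


class Utf8ReadBuffer(io.RawIOBase):
    """Hypothetical stdin.buffer stand-in: reads UTF-16-LE bytes from an
    underlying raw stream and hands out UTF-8 bytes."""

    def __init__(self, raw_utf16):
        self._raw = raw_utf16
        self._decoder = codecs.getincrementaldecoder("utf-16-le")(errors="replace")
        # UTF-8 output can be longer than the UTF-16 input, so keep
        # whatever does not fit in the caller's buffer for the next call.
        self._pending = b""

    def readable(self):
        return True

    def readinto(self, b):
        while not self._pending:
            chunk = self._raw.read(len(b) or 1)
            if not chunk:
                # End of input: flush any dangling partial code unit.
                tail = self._decoder.decode(b"", final=True)
                self._pending = tail.encode("utf-8")
                break
            self._pending = self._decoder.decode(chunk).encode("utf-8")
        n = min(len(b), len(self._pending))
        b[:n] = self._pending[:n]
        self._pending = self._pending[n:]
        return n
```

Code that does sys.stdout.buffer.write(b'...') would keep working against
something like Utf8WriteBuffer, with invalid UTF-8 turned into U+FFFD
rather than raising. A real implementation would also need flushing,
line-oriented reads, and the actual console API calls, none of which are
shown here.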

