<div dir="ltr"><blockquote style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex" class="gmail_quote"><pre>On 11 August 2016 at 04:10, Steve Dower <<a href="https://mail.python.org/mailman/listinfo/python-ideas">steve.dower at python.org</a>> wrote:
><i>
</i>><i> I suspect there's a lot of discussion to be had around this topic, so I want to get it started. There are some fairly drastic ideas here and I need help figuring out whether the impact outweighs the value.
</i>
My main reaction would be that if Drekin (Adam Bartoš) agrees the
changes natively solve the problems that
<a href="https://pypi.python.org/pypi/win_unicode_console">https://pypi.python.org/pypi/win_unicode_console</a> works around, it's
probably a good idea.
The status quo is also sufficiently broken from both a native Windows
perspective and a cross-platform compatibility perspective that your
proposals are highly unlikely to make things *worse* :)
Cheers,
Nick.</pre></blockquote><div><br></div><div>The main idea of win_unicode_console is simple: use the WinAPI functions ReadConsoleW and WriteConsoleW to communicate with the interactive console on Windows, and wrap this in the standard Python IO hierarchy. That's why sys.std*.encoding would be 'utf-16-le': it corresponds to the widechar strings used by the Windows wide APIs. But this is only about sys.std*.encoding, which I think is not so important. AFAIK sys.std*.encoding should be used only when you want to communicate in bytes (which I think is not a good idea), so it tells you which encoding sys.std*.buffer assumes. In fact, sys.std* may not even have a buffer attribute, in which case its encoding attribute would be useless.<br><br></div><div>Unfortunately, sys.std*.encoding is also used in some other places. Namely, the consumers of the old PyOS_Readline API (the tokenizer and input) use it to decode the bytes returned. Actually, the consumers assume different encodings (sys.stdin.encoding vs. sys.stdout.encoding), so it is impossible to write a correct readline hook when the two encodings are not the same. So I think it would be nice to have a Python-level, string-based implementation of readline hooks: a sys.readlinehook attribute, which would use sys.std* by default on Windows and GNU readline on Unix.<br><br></div><div>Nevertheless, I think it is a good idea to have more 'utf-8' defaults (or 'utf-8-readsig' for open()). I don't know whether opening the standard streams in 'utf-8' helps with the console issue.<br><br></div><div>Adam Bartoš<br></div></div>