
On Wed, 10 Aug 2016 at 11:16 Steve Dower steve.dower@python.org wrote:
[SNIP]
Finally, the encoding of stdin, stdout and stderr are currently (correctly) inferred from the encoding of the console window that Python is attached to. However, this is typically a codepage that is different from the system codepage (i.e. it's not mbcs) and is almost certainly not Unicode. If users are starting Python from a console, they can use "chcp 65001" first to switch to UTF-8, and then *most* functionality works (input() has some issues, but those can be fixed with a slight rewrite and possibly breaking readline hooks).
It is also possible for Python to change the current console encoding to be UTF-8 on initialize and change it back on finalize. (This would leave the console in an unexpected state if Python segfaults, but console encoding is probably the least of anyone's worries at that point.) So I'm proposing actively changing the current console to be Unicode while Python is running, and hence sys.std[in|out|err] will default to utf-8.
So that's a broad range of changes, and I have little hope of figuring out all the possible issues, back-compat risks, and flow-on effects on my own. Please let me know (either on-list or off-list) how a change like this would affect your projects, either positively or negatively, and whether you have any specific experience with these changes/fixes and think they should be approached differently.
To summarise the proposals (remembering that these would only affect Python 3.6 on Windows):
[SNIP]
- force the console encoding to UTF-8 on initialize and revert on finalize
Don't have enough Windows experience to comment on the other parts of this proposal, but for the console encoding I am a hearty +1 as I'm tired of Unicode characters failing to show up in the REPL.