Re: [Python-ideas] Fix default encodings on Windows

10 Aug 2016

On 10 August 2016 at 19:10, Steve Dower <steve.dower@python.org> wrote:
...
> To summarise the proposals (remembering that these would only affect Python
> 3.6 on Windows):
>
> * change sys.getfilesystemencoding() to return 'utf-8'
> * automatically decode byte paths assuming they are utf-8
> * remove the deprecation warning on byte paths
> * make the default open() encoding check for a BOM or else use utf-8
> * [ALTERNATIVE] make the default open() encoding check for a BOM or else use
>   locale.getpreferredencoding()
> * force the console encoding to UTF-8 on initialize and revert on finalize
>
> So what are your concerns? Suggestions?

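To make the byte-path item concrete, here is a minimal sketch of the difference between decoding a byte path as UTF-8 and decoding it with an ANSI code page. The helper name decode_byte_path is made up for this example, and cp1252 is only an assumed stand-in for whatever code page a given machine uses; nothing here is the actual implementation.

```python
# Hypothetical helper, for illustration only; not part of any proposed API.
def decode_byte_path(raw, encoding="utf-8"):
    """Decode a bytes path to str using the given encoding."""
    return raw.decode(encoding)

# "caf\u00e9.txt" encoded as UTF-8, as a cross-platform program might produce it.
raw_path = b"caf\xc3\xa9.txt"

# Proposed behaviour: interpret the bytes as UTF-8.
as_utf8 = decode_byte_path(raw_path)

# Assumed current behaviour on a machine whose ANSI code page is cp1252:
# the two UTF-8 bytes of the accented letter become two separate characters.
as_ansi = decode_byte_path(raw_path, "cp1252")
```

With the UTF-8 interpretation the name round-trips intact, whereas the code-page interpretation yields a different (mojibake) file name.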
I presume you'd be targeting 3.7 for this change. Broadly, I'm +1 on
all of this. Personally, I'm moving to UTF-8 everywhere, so it seems
OK to me, but I suspect defaulting open() to UTF-8 in the absence of a
BOM might cause issues for people. Most text editors still (AFAIK) use
the ANSI codepage by default, and it's the one place where an
identifying BOM isn't possible. So your alternative may be a safer
choice. On the other hand, files from Unix (via, say, GitHub) would
typically be UTF-8 without BOM, so it becomes a question of choosing
the best compromise. I'm inclined to go for cross-platform and UTF-8
and clearly document the change. We might want a more convenient short
form for open(filename, "r", encoding=locale.getpreferredencoding()),
though, to ease the transition... We'd also need to consider how the
new default encoding would interact with PYTHONIOENCODING.
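
As a rough illustration of "check for a BOM or else fall back", here is a sketch of what such a short form might look like for reading. The names sniff_encoding and open_text and the use_locale flag are invented for this example and are not a proposed API. It only recognises the UTF-8 and UTF-16 BOMs, and it ignores PYTHONIOENCODING entirely.

```python
import codecs
import locale

# Hypothetical names throughout; a sketch, not the real open() logic.
# Only UTF-8 and UTF-16 BOMs are recognised (UTF-32 is deliberately ignored).
_BOMS = (
    (codecs.BOM_UTF8, "utf-8-sig"),   # "utf-8-sig" strips the BOM on read
    (codecs.BOM_UTF16_LE, "utf-16"),  # "utf-16" uses the BOM to pick byte order
    (codecs.BOM_UTF16_BE, "utf-16"),
)


def sniff_encoding(filename, fallback="utf-8"):
    """Return an encoding chosen from a leading BOM, else the fallback."""
    with open(filename, "rb") as f:
        head = f.read(4)
    for bom, encoding in _BOMS:
        if head.startswith(bom):
            return encoding
    return fallback


def open_text(filename, use_locale=False):
    """Open a file for reading, honouring a BOM if one is present.

    use_locale=False falls back to UTF-8 (the main proposal);
    use_locale=True falls back to the locale's preferred encoding
    (the [ALTERNATIVE] proposal).
    """
    fallback = locale.getpreferredencoding(False) if use_locale else "utf-8"
    return open(filename, "r", encoding=sniff_encoding(filename, fallback))
```

Something like open_text(filename, use_locale=True) would then be the convenient spelling for people who need the old behaviour during the transition.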

For the console, does this mean that the win_unicode_console module
will no longer be needed when these changes go in?

Sorry, not much in the way of direct experience or information I can
add, but a strong +1 on the change (and I'd be happy to help where
needed).

Paul

Paul Moore