[Python-Dev] File system path encoding on Windows

Nick Coghlan ncoghlan at gmail.com
Sat Aug 20 15:31:27 EDT 2016


On 20 August 2016 at 04:59, Steve Dower <steve.dower at python.org> wrote:
> Questions:
> * should we always use Windows' Unicode APIs instead of switching between
> bytes/Unicode based on parameter type?

Yes

> * should we allow users to pass bytes and interpret them as utf-8 rather
> than letting Windows do the decoding?

Yes (eventually)
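
To make the difference concrete, here is a minimal sketch of what the
two interpretations do to the same bytes. The file name is made up, and
cp1252 is only an assumed stand-in for whatever ANSI code page a given
Windows machine is configured with.

```python
# UTF-8 bytes for a hypothetical file name containing U+00E9 (e with acute).
raw_name = "caf\u00e9.txt".encode("utf-8")


def decode_as_utf8(name):
    """Interpret a bytes path as UTF-8, as the question proposes."""
    return name.decode("utf-8")


def decode_as_ansi(name, code_page="cp1252"):
    """Interpret a bytes path using an ANSI code page.

    cp1252 is an assumption for illustration; the real code page
    depends on the machine's configuration.
    """
    return name.decode(code_page)


# ascii() keeps the output printable whatever the console encoding is.
print(ascii(decode_as_utf8(raw_name)))
print(ascii(decode_as_ansi(raw_name)))
```

The two results differ, so bytes that were produced as UTF-8 only
round-trip if they are also decoded as UTF-8.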

> * should we do it in 3.6, 3.7 or 3.8?

Reading your summary, this finally clicked with something Victor has
been considering for a while: a "Force UTF-8" switch that tells Python
to ignore the locale encoding on Linux and instead assume UTF-8
everywhere (command line parameter parsing, environment variable
processing, filesystem encoding, standard streams, etc.).
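
For anyone who wants to see what their interpreter currently picks up
from the environment, a rough sketch along these lines lists the
settings such a switch would override. The helper name is mine, and it
only reads existing values; it does not implement the switch.

```python
import locale
import sys


def report_encodings():
    """Collect the encodings Python currently derives from its environment."""
    return {
        "filesystem": sys.getfilesystemencoding(),
        # False: report the configured locale encoding without changing it.
        "locale preferred": locale.getpreferredencoding(False),
        # A stream can be None or replaced, so fall back to None.
        "stdin": getattr(sys.stdin, "encoding", None),
        "stdout": getattr(sys.stdout, "encoding", None),
        "stderr": getattr(sys.stderr, "encoding", None),
    }


for name, encoding in report_encodings().items():
    print(f"{name}: {encoding}")
```

Running it under different locale settings (for example with LANG=C on
Linux) should show how much of this is currently driven by the locale.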

It's essentially the same problem you have on Windows, just with
slightly different symptoms and consequences.

Prompted by that realisation, I'd like to suggest an option that
didn't come up on python-ideas: add such a flag to Python 3.6, and
then actively seek feedback from folks using non-UTF-8 encodings
before making a decision on what to do by default in Python 3.7.

This is a really hard problem for people to reason about abstractly,
but "try running Python with this new flag, and see if anything
breaks" is a much easier question to ask and answer.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

