[Python-Dev] PEP 529: Change Windows filesystem encoding to UTF-8

Guido van Rossum guido at python.org
Wed Sep 7 13:37:01 EDT 2016


I'm hijacking this thread to provisionally accept PEP 529. (I'll also
do this for PEP 528, in its own thread.)

I've talked things over with Steve and Victor and we're going to do an
experiment (as now written up in the PEP:
https://www.python.org/dev/peps/pep-0529/#beta-experiment) to tease
out any issues with this change during the beta. If serious problems
crop up we may have to roll back the changes and reject the PEP -- we
won't get another chance at getting this right. (That would also mean
that using the binary filesystem APIs will remain deprecated and will
eventually be disallowed; as long as the PEP remains accepted they are
undeprecated.)

Congrats Steve! Thanks for the massive amount of work on the
implementation and the thinking that went into the design. Thanks
everyone else for their feedback.

--Guido

PS. I have one small inline response to Nick below.

On Sun, Sep 4, 2016 at 11:58 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> On 5 September 2016 at 15:59, Steve Dower <steve.dower at python.org> wrote:
>> +continue to default to ``locale.getpreferredencoding()`` (for text files) or
>> +plain bytes (for binary files). This only affects the encoding used when users
>> +pass a bytes object to Python where it is then passed to the operating system as
>> +a path name.
>
> For the three non-filesystem cases:
>
> I checked the situation for os.environb, and that's already
> unavailable on Windows (since os.supports_bytes_environ is False
> there), while sys.argv is apparently already handled correctly (i.e.
> always using the *W APIs).
>
> That means my only open question would be the handling of subprocess
> module calls (both with and without shell=True), since that currently
> works with binary arguments on *nix:
>
> >>> subprocess.call([b"python", b"-c", "print('ℙƴ☂ℌøἤ')".encode("utf-8")])
> ℙƴ☂ℌøἤ
> 0
> >>> subprocess.call(b"python -c '%s'" % 'print("ℙƴ☂ℌøἤ")'.encode("utf-8"), shell=True)
> ℙƴ☂ℌøἤ
> 0
>
> While calling system native apps that way will still have many
> portability challenges, there are also plenty of cases where folks use
> sys.executable to launch new Python processes in a separate instance
> of the currently running interpreter, and it would be good if these
> changes brought cross-platform consistency to the handling of binary
> arguments here as well.

I checked with Steve and this is not supported anyway -- bytes
arguments (regardless of the value of shell) fail early with a
TypeError. That may be a bug but there's no backwards compatibility to
preserve here. (And apart from Python, few shell commands that work on
Unix make much sense on Windows, so I'm also not particularly worried
about that particular example being non-portable -- it doesn't
represent a realistic concern.)
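
For code that has to run on both platforms today, one possible workaround
is to decode any bytes arguments to str before handing them to subprocess.
The following is only a minimal sketch and not part of the PEP: the helper
name to_text_args is hypothetical, and it assumes the bytes are valid in
the filesystem encoding that os.fsdecode() uses.

```python
import os
import subprocess
import sys


def to_text_args(args):
    """Return a copy of args with every bytes item decoded to str.

    os.fsdecode() decodes bytes with the filesystem encoding and
    returns str items unchanged. (Hypothetical helper, for illustration.)
    """
    return [os.fsdecode(arg) for arg in args]


# Launch a child copy of the running interpreter via sys.executable,
# mixing str and bytes arguments; the child exits with status 3.
cmd = to_text_args([sys.executable, b"-c", b"import sys; sys.exit(3)"])
exit_code = subprocess.call(cmd)
```

Because every argument is a str by the time subprocess sees it, the same
call should not depend on whether the platform accepts bytes arguments.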

-- 
--Guido van Rossum (python.org/~guido)

