[Python-ideas] Fix default encodings on Windows
Steve Dower
steve.dower at python.org
Tue Aug 16 09:59:00 EDT 2016
Hmm, it doesn't seem to be explicitly listed as a deprecation, though discussion from around that time makes it clear that everyone thought it was.
I also found this proposal to use strict mbcs to decode bytes for use against the file system, which is basically the same as what I'm proposing now apart from the more limited encoding: https://mail.python.org/pipermail/python-dev/2011-October/114203.html
It definitely results in less C code to maintain if we do the decode ourselves. We could use strict mbcs, but I'd leave the deprecation warnings in there. Or perhaps we provide an env var to use mbcs as the file system encoding but default to utf8 (I already have one for selecting legacy console encoding)? Callers should be asking the sys module for the encoding anyway, so I'd expect few libraries to be impacted, though applications might prefer it.
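To illustrate that last point, here is a rough sketch of a caller asking the sys module instead of hard-coding an encoding. The helper name `roundtrip_path` is made up for the example; `sys.getfilesystemencoding()`, `os.fsencode()` and `os.fsdecode()` are the existing stdlib calls.

```python
import os
import sys


def roundtrip_path(name):
    """Encode a str path to bytes and back using the interpreter's own
    file system encoding, rather than assuming mbcs or utf-8."""
    raw = os.fsencode(name)      # str -> bytes, using the file system encoding
    return os.fsdecode(raw)      # bytes -> str, using the same encoding


# Whatever the interpreter reports is what callers should rely on.
fs_encoding = sys.getfilesystemencoding()
restored = roundtrip_path("example.txt")
```

Code written this way should not care whether the default ends up being mbcs or utf8, which is why I'd expect few libraries to be impacted.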
Top-posted from my Windows Phone
-----Original Message-----
From: "Paul Moore" <p.f.moore at gmail.com>
Sent: 8/16/2016 3:54
To: "Nick Coghlan" <ncoghlan at gmail.com>
Cc: "python-ideas" <python-ideas at python.org>
Subject: Re: [Python-ideas] Fix default encodings on Windows
On 15 August 2016 at 19:26, Steve Dower <steve.dower at python.org> wrote:
> Passing path_as_bytes in that location has been deprecated since 3.3, so we
> are well within our rights (and probably overdue) to make it a TypeError in
> 3.6. While it's obviously an invalid assumption, for the purposes of
> changing the language we can assume that no existing code is passing bytes
> into any functions where it has been deprecated.
>
> As far as I'm concerned, there are currently no filesystem APIs on Windows
> that accept paths as bytes.
[...]
On 16 August 2016 at 03:00, Nick Coghlan <ncoghlan at gmail.com> wrote:
> The problem is that bytes-as-paths actually *does* work for Mac OS X
> and systemd based Linux distros properly configured to use UTF-8 for
> OS interactions. This means that a lot of backend network service code
> makes that assumption, especially when it was originally written for
> Python 2, and rather than making it work properly on Windows, folks
> just drop Windows support as part of migrating to Python 3.
>
> At an ecosystem level, that means we're faced with a choice between
> implicitly encouraging folks to make their code *nix only, and finding
> a way to provide a more *nix like experience when running on Windows
> (where UTF-8 encoded binary data just works, and either other
> encodings lead to mojibake or else you use chardet to figure things
> out).
>
> Steve is suggesting that the latter option is preferable, a view I
> agree with since it lowers barriers to entry for Windows based
> developers to contribute to primarily *nix focused projects.
So does this mean that you're recommending reverting the deprecation
of bytes as paths in favour of documenting that bytes as paths is
acceptable, but it will require an encoding of UTF-8 rather than the
current behaviour? If so, that raises some questions:
1. Is it OK to backtrack on a deprecation by changing the behaviour
like this? (I think it is, but others who rely on the current,
deprecated, behaviour may not).
2. Should we be making "always UTF-8" the behaviour on all platforms,
rather than just Windows (e.g., Unix systems which haven't got UTF-8
as their locale setting)? This doesn't seem to be a Windows-specific
question any more (I'm assuming that if bytes-as-paths are deprecated,
that's a cross-platform change, but see below).
Having said all this, I can't find the documentation stating that
bytes paths are deprecated - the open() documentation for 3.5 says
"file is either a string or bytes object giving the pathname (absolute
or relative to the current working directory) of the file to be opened
or an integer file descriptor of the file to be wrapped" and there's
no mention of a deprecation. Steve - could you provide a reference?
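For reference, a minimal sketch of what that documented behaviour allows. The helper name `write_via_bytes_path` and the file name are hypothetical; on Windows with 3.5, I'd assume the bytes path may also trigger a DeprecationWarning.

```python
import os
import tempfile


def write_via_bytes_path(directory, name, text):
    """Open a file using a bytes path, as the open() docs say is allowed."""
    # Build the path as str, then convert it with the file system encoding.
    bytes_path = os.fsencode(os.path.join(directory, name))
    with open(bytes_path, "w", encoding="utf-8") as f:
        f.write(text)
    return bytes_path


with tempfile.TemporaryDirectory() as tmp:
    path = write_via_bytes_path(tmp, "example.txt", "hello")
    # Read it back through the equivalent str path.
    with open(os.fsdecode(path), encoding="utf-8") as f:
        content = f.read()
```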
Paul