
On 15 August 2016 at 19:26, Steve Dower <steve.dower@python.org> wrote:
Passing path_as_bytes in that location has been deprecated since 3.3, so we are well within our rights (and probably overdue) to make it a TypeError in 3.6. While it's obviously an invalid assumption, for the purposes of changing the language we can assume that no existing code is passing bytes into any functions where it has been deprecated.
As far as I'm concerned, there are currently no filesystem APIs on Windows that accept paths as bytes.
[...] On 16 August 2016 at 03:00, Nick Coghlan <ncoghlan@gmail.com> wrote:
The problem is that bytes-as-paths actually *does* work for Mac OS X and systemd based Linux distros properly configured to use UTF-8 for OS interactions. This means that a lot of backend network service code makes that assumption, especially when it was originally written for Python 2, and rather than making it work properly on Windows, folks just drop Windows support as part of migrating to Python 3.
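As a rough illustration of the bytes-as-paths behaviour described above, the following sketch creates a file through a str path and reads it back through the equivalent bytes path. The helper name roundtrip_bytes_path and the file name example.txt are made up for this example; os.fsencode is used rather than a hard-coded UTF-8 encode so the snippet follows whatever filesystem encoding the interpreter reports.

```python
import os
import sys
import tempfile


def roundtrip_bytes_path(name="example.txt"):
    """Write a file via a str path, then read it back via a bytes path.

    Hypothetical helper for illustration only.
    """
    with tempfile.TemporaryDirectory() as tmpdir:
        str_path = os.path.join(tmpdir, name)
        with open(str_path, "w", encoding="utf-8") as f:
            f.write("hello")

        # os.fsencode applies the filesystem encoding, which is commonly
        # UTF-8 on Mac OS X and on Linux systems with a UTF-8 locale.
        bytes_path = os.fsencode(str_path)

        # open() accepts the bytes path and resolves it to the same file.
        with open(bytes_path, "r", encoding="utf-8") as f:
            content = f.read()

    return bytes_path, content


bytes_path, content = roundtrip_bytes_path()
print(sys.getfilesystemencoding(), type(bytes_path).__name__, content)
```

Code that instead calls path.encode("utf-8") directly is making the assumption discussed here: it only matches the str path where the filesystem encoding really is UTF-8.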
At an ecosystem level, that means we're faced with a choice between implicitly encouraging folks to make their code *nix only, and finding a way to provide a more *nix like experience when running on Windows (where UTF-8 encoded binary data just works, and either other encodings lead to mojibake or else you use chardet to figure things out).
Steve is suggesting that the latter option is preferable, a view I agree with since it lowers barriers to entry for Windows based developers to contribute to primarily *nix focused projects.
So does this mean that you're recommending reverting the deprecation of bytes as paths in favour of documenting that bytes as paths is acceptable, but it will require an encoding of UTF-8 rather than the current behaviour? If so, that raises some questions:

1. Is it OK to backtrack on a deprecation by changing the behaviour like this? (I think it is, but others who rely on the current, deprecated, behaviour may not).
2. Should we be making "always UTF-8" the behaviour on all platforms, rather than just Windows (e.g., Unix systems which haven't got UTF-8 as their locale setting)?

This doesn't seem to be a Windows-specific question any more (I'm assuming that if bytes-as-paths are deprecated, that's a cross-platform change, but see below).

Having said all this, I can't find the documentation stating that bytes paths are deprecated - the open() documentation for 3.5 says "file is either a string or bytes object giving the pathname (absolute or relative to the current working directory) of the file to be opened or an integer file descriptor of the file to be wrapped" and there's no mention of a deprecation.

Steve - could you provide a reference?

Paul