[Python-Dev] Windows: Remove support of bytes filenames in the os module?

Brett Cannon brett at python.org
Mon Feb 8 12:02:41 EST 2016


On Mon, 8 Feb 2016 at 06:33 Victor Stinner <victor.stinner at gmail.com> wrote:

> Hi,
>
> Since 3.3, functions of the os module have emitted a
> DeprecationWarning when called with bytes filenames on Windows.
>
> The rationale is quite simple: the Windows native type for filenames is
> Unicode, and Windows behaves oddly when you use bytes. For
> example, os.listdir(b'.') gives you paths which cannot be used with
> open() for filenames which are not encodable to the ANSI code page.
> Unencodable characters are replaced with "?". The following issue was
> opened to document this weird behaviour (but the doc was never
> completed):
>
> "Document that bytes OS API can returns unusable results on Windows"
> http://bugs.python.org/issue16700
>
>
> When the new os.scandir() API was designed, I asked to *not* support
> bytes filenames since they are "broken by design".
> https://www.python.org/dev/peps/pep-0471/
>
> Recently, a user complained that os.walk() doesn't work with bytes on
> Windows anymore:
>
> "Regression: os.walk now using os.scandir() breaks bytes filenames on
> windows"
> http://bugs.python.org/issue25911
>
>
> Serhiy Storchaka just pushed a change to reintroduce bytes
> support on Windows in os.walk(), but I would prefer to do the
> *opposite*: drop support for bytes filenames on Windows.
>
> Are we brave enough to force users to use the "right" type for filenames?
>
> --
>
> On Python 2, it wasn't possible to use Unicode for filenames: many
> functions fail badly with Unicode, especially when you mix bytes and
> Unicode.
>
> On Python 3, Unicode is the "natural" type, most Python functions
> prefer Unicode, and PEP 383 (surrogateescape) makes it possible to safely
> use Unicode on UNIX even with undecodable filenames (invalid bytes are
> stored as Unicode surrogate characters).
>

If Unicode strings don't work in Python 2, then what is Python 2/3 code to do as a
cross-platform solution if we completely remove bytes support in Python 3?
Wouldn't that mean there is no common type between Python 2 & 3 that one
can use which will work with the os module except native strings (which are
difficult to get right)?
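
For reference, the two behaviours described in the quoted message can be
sketched in a few lines of Python 3. This is only an approximation, not what
the os module does internally: the code page (cp1252) and both sample
filenames are assumptions chosen for illustration, and the first half merely
imitates the "?" replacement with str.encode() rather than calling any
Windows API.

```python
# Sketch only: cp1252 and the sample names below are illustrative assumptions.

# 1) Lossy path: imitate a bytes API that encodes to an ANSI code page
#    and replaces unencodable characters with "?".
name = "r\u00e9sum\u00e9_\u4e2d\u6587.txt"
lossy = name.encode("cp1252", errors="replace")
lossy_roundtrip = lossy.decode("cp1252")
# The two CJK characters are gone, so the original name cannot be recovered.
name_survived = (lossy_roundtrip == name)

# 2) PEP 383 path: undecodable bytes are kept as lone surrogates and can be
#    turned back into exactly the same bytes.
raw = b"caf\xe9.txt"  # Latin-1 encoded name, not valid UTF-8
text = raw.decode("utf-8", errors="surrogateescape")
restored = text.encode("utf-8", errors="surrogateescape")

print(lossy, name_survived)
print(ascii(text), restored == raw)
```

The first conversion loses information, while the second is reversible, which
is the distinction the quoted message draws between bytes filenames on Windows
and Unicode filenames on UNIX.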