[Python-Dev] Windows: Remove support of bytes filenames in the os module?
Victor Stinner
victor.stinner at gmail.com
Mon Feb 8 09:32:00 EST 2016
Hi,
Since Python 3.3, functions of the os module emit a
DeprecationWarning when called with bytes filenames on Windows.
The rationale is quite simple: the native Windows type for filenames is
Unicode, and Windows behaves strangely when you use bytes. For
example, os.listdir(b'.') returns paths that cannot be passed to
open() when a filename is not encodable to the ANSI code page:
unencodable characters are replaced with "?". The following issue was
opened to document this weird behaviour (but the doc was never
completed):
"Document that bytes OS API can returns unusable results on Windows"
http://bugs.python.org/issue16700
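The lossy conversion described above can be imitated on any platform
with plain codecs. This is only a sketch: it assumes cp1252 as the ANSI
code page (the real one depends on the Windows locale), the filename is
made up, and it uses str.encode() rather than the actual Windows API.

```python
# Hypothetical filename: "é" exists in cp1252, the two CJK characters do not.
name = "caf\u00e9_\u65e5\u672c.txt"

# Assumption: cp1252 stands in for the ANSI code page. errors="replace"
# substitutes "?" for every character the code page cannot represent.
ansi = name.encode("cp1252", errors="replace")
print(ansi)  # b'caf\xe9_??.txt'

# Decoding the bytes does not give the original name back, so the
# result no longer identifies the file on disk.
roundtrip = ansi.decode("cp1252")
print(roundtrip == name)  # False
```

Once the "?" has been substituted, the original characters are gone, so
the bytes path cannot be mapped back to the real file.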
When the new os.scandir() API was designed, I asked to *not* support
bytes filenames since they are "broken by design".
https://www.python.org/dev/peps/pep-0471/
Recently, a user complained that os.walk() doesn't work with bytes on
Windows anymore:
"Regression: os.walk now using os.scandir() breaks bytes filenames on windows"
http://bugs.python.org/issue25911
Serhiy Storchaka just pushed a change to reintroduce bytes
support on Windows in os.walk(), but I would prefer to do the
*opposite*: drop support for bytes filenames on Windows.
Are we brave enough to force users to use the "right" type for filenames?
--
On Python 2, it wasn't possible to use Unicode for filenames: many
functions failed badly with Unicode, especially when you mixed bytes
and Unicode.
On Python 3, Unicode is the "natural" type, most Python functions
prefer Unicode, and PEP 383 (surrogateescape) makes it possible to
safely use Unicode on UNIX even with undecodable filenames (invalid
bytes are stored as Unicode surrogate characters).
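A minimal sketch of the surrogateescape round trip, assuming a UTF-8
filesystem encoding and a made-up filename containing one invalid byte.
It calls the codec directly instead of going through the os module:

```python
# Hypothetical UNIX filename: 0xff is never valid in UTF-8.
raw = b"report_\xff.txt"

# The undecodable byte 0xff is stored as the lone surrogate U+DCFF.
name = raw.decode("utf-8", errors="surrogateescape")
print(ascii(name))  # 'report_\udcff.txt'

# Encoding with the same error handler restores the original bytes.
restored = name.encode("utf-8", errors="surrogateescape")
print(restored == raw)  # True
```

Because the round trip is lossless, a str filename obtained this way
still refers to the same file when it is encoded again.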
Victor