[Python-Dev] Windows: Remove support of bytes filenames in the os module?
victor.stinner at gmail.com
Mon Feb 8 09:32:00 EST 2016
Since Python 3.3, functions of the os module have emitted
DeprecationWarning when called with bytes filenames.
The rationale is quite simple: the native Windows type for filenames is
Unicode, and Windows behaves strangely when you use bytes. For
example, os.listdir(b'.') gives you paths which cannot be used with
open() for filenames which are not encodable to the ANSI code page.
Unencodable characters are replaced with "?". The following issue was
opened to document this weird behaviour (but the doc was never
"Document that bytes OS API can returns unusable results on Windows"
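To make the "?" replacement concrete, here is a minimal sketch that only approximates the lossy conversion. It uses Python's own cp1252 codec rather than the Windows API; the choice of code page and the helper name to_ansi_lossy are assumptions made for this illustration.

```python
# Illustration only: approximates the lossy ANSI conversion with Python's
# cp1252 codec instead of calling any Windows API.
ANSI_CODE_PAGE = "cp1252"  # assumed code page for this sketch


def to_ansi_lossy(name):
    """Hypothetical helper: encode a name, replacing unencodable characters with '?'."""
    return name.encode(ANSI_CODE_PAGE, errors="replace")


original = "\u03b5.txt"  # Greek small epsilon, not representable in cp1252
as_bytes = to_ansi_lossy(original)
round_tripped = as_bytes.decode(ANSI_CODE_PAGE)

print(as_bytes)                   # b'?.txt'
print(round_tripped == original)  # False: the original name is lost
```

Once the character has been replaced, the bytes name no longer identifies the file on disk, which is why such a path cannot be passed back to open().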
When the new os.scandir() API was designed, I asked to *not* support
bytes filenames since they are "broken by design".
Recently, a user complained that os.walk() doesn't work with bytes on
Windows:
"Regression: os.walk now using os.scandir() breaks bytes filenames on windows"
Serhiy Storchaka just pushed a change to reintroduce bytes
support on Windows in os.walk(), but I would prefer to do the
*opposite*: drop support for bytes filenames on Windows.
Are we brave enough to force users to use the "right" type for filenames?
On Python 2, it wasn't possible to use Unicode for filenames: many
functions fail badly with Unicode, especially when you mix bytes and
Unicode.
On Python 3, Unicode is the "natural" type, most Python functions
prefer Unicode, and PEP 383 (surrogateescape) makes it possible to safely
use Unicode on UNIX even with undecodable filenames (invalid bytes are
stored as Unicode surrogate characters).
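As a small sketch of the surrogateescape round trip: the example below uses the UTF-8 codec directly so that it behaves the same on any platform. On UNIX the real filesystem encoding depends on the locale, so the codec choice here is an assumption for the illustration.

```python
# Sketch of the PEP 383 round trip, using the UTF-8 codec explicitly.
raw = b"caf\xe9.txt"  # Latin-1 encoded name; byte 0xE9 is invalid as UTF-8

# The undecodable byte 0xE9 is mapped to the lone surrogate U+DCE9.
name = raw.decode("utf-8", errors="surrogateescape")

# Encoding with the same error handler restores the original bytes.
restored = name.encode("utf-8", errors="surrogateescape")

print(ascii(name))      # 'caf\udce9.txt'
print(restored == raw)  # True
```

The str value can be passed around like any other filename, and the original bytes are recovered exactly when it is encoded again.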