[Python-Dev] My summary of the scandir (PEP 471)

Victor Stinner victor.stinner at gmail.com
Tue Jul 1 16:28:10 CEST 2014


2014-07-01 15:00 GMT+02:00 Ben Hoyt <benhoyt at gmail.com>:
> (a) it doesn't call stat for you (on POSIX), so you have to check an
> attribute and call scandir manually if you need it,

Yes, and that's something common when you use the os module. For
example, don't try to call os.fork(), os.getgid() or os.fchmod() on
Windows :-) Closer to your PEP, the following OS attributes are only
available on UNIX: st_blocks, st_blksize, st_rdev, st_flags; and
st_file_attributes is only available on Windows.

I don't think that using lstat_result is a common need when browsing a
directoy. In most cases, you only need is_dir() and the name
attribute.

> 1) the original proposal in the current version of PEP 471, where
> DirEntry has an .lstat() method which calls stat() on POSIX but is
> free on Windows

On UNIX, does it mean that .lstat() calls os.lstat() at the first
call, and then always return the same result? It would be different
than os.lstat() and pathlib.Path.stat() :-( I would prefer to have the
same behaviour than pathlib and os (you know, the well known
consistency of Python stdlib). As I wrote, I expect a function call to
always retrieve the new status.

> 2) Nick Coghlan's proposal on the previous thread
> (https://mail.python.org/pipermail/python-dev/2014-June/135261.html)
> suggesting an ensure_lstat keyword param to scandir if you need the
> lstat_result value

I don't like this idea because it makes error handling more complex.
The syntax to catch exceptions on an iterator is verbose (while: try:
next() except ...).

Whereas calling os.lstat(entry.fullname()) is explicit and it's easy
to surround it with try/except.


> .lstat_result being None sometimes (on POSIX),

Don't do that, it's not how Python handles portability. We use hasattr().


> would it ever really happen that readdir() would succeed but an os.stat() immediately after would fail?

Yes, it can happen. The filesystem is system-wide and shared by all
users. The file can be deleted.


> Really, are bytes filenames deprecated?

Yes, in all functions of the os module since Python 3.3. I'm sure
because I implemented the deprecation :-)

Try open(b'test.txt', w') on Windows with python -Werror.


> I think maybe they should be, as they don't work on Windows :-)

Windows has an API dedicated to bytes filenames, the ANSI API. But
this API has annoying bugs: it replaces unencodable characters by
question marks, and there is no option to be noticed on the encoding
error.

Different users complained about that. It was decided to not change
Python since Python is a light wrapper over the kernel system calls.
But bytes filenames are now deprecated to advice users to use the
native type for filenames on Windows: Unicode!


> but the latest Python "os" docs
> (https://docs.python.org/3.5/library/os.html) still say that all
> functions that accept path names accept either str or bytes,

Maybe I forgot to update the documentation :-(


> So I think scandir() should do the same thing.

You may support scandir(bytes) on Windows but you will need to emit a
deprecation warning too. (which are silent by default.)

Victor


More information about the Python-Dev mailing list