[Python-Dev] Remaining decisions on PEP 471 -- os.scandir()

Ben Hoyt benhoyt at gmail.com
Mon Jul 14 14:27:39 CEST 2014


First, just to clarify a couple of points.

> You forgot one of my arguments: we must have exactly the same API as
> os.path.is_dir() and pathlib.Path.is_dir(), because it would be very
> confusing (source of bugs) to have a different behaviour.

Actually, I specifically included that argument. It's item (b) in the
list in my original message yesterday. :-)

> Since these functions don't have any parameter (there is no such
> follow_symlink(s) parameter), I'm opposed to the idea of adding such a
> parameter.
>
> If you really want to add a follow_symlink optional parameter, IMO you
> should modify all os.path.is*() functions and all pathlib.Path.is*()
> methods to add it there too. Maybe if nobody asked for this feature
> before, it's because it's not useful in practice. You can simply test
> explicitly is_symlink() before checking is_dir().

Yeah, this is fair enough.

> Well, let's imagine DirEntry.is_dir() does not follow symlinks. How do
> you test is_dir() and follow symlinks?
> "stat.S_ISDIR(entry.stat().st_mode)" ? You have to import the stat
> module, and use the ugly C macro S_ISDIR().

No, you don't actually need stat/S_ISDIR in that case -- if
DirEntry.is_dir() does not follow symlinks, you just say:

entry.is_symlink() and os.path.isdir(entry.full_name)

Or for the full test:

(entry.is_symlink() and os.path.isdir(entry.full_name)) or entry.is_dir()
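
To make that concrete, here's a rough sketch of the full test. Note that
the Entry class below is a made-up stand-in for the proposed DirEntry
(it is not the real implementation, and the helper name is_dir_following
is mine); it only assumes a full_name attribute plus is_symlink() and a
non-following is_dir():

```python
import os
import stat


class Entry:
    """Hypothetical stand-in for the proposed DirEntry (illustration only)."""

    def __init__(self, dirpath, name):
        self.name = name
        self.full_name = os.path.join(dirpath, name)

    def lstat(self):
        # lstat() describes the link itself, never its target.
        return os.lstat(self.full_name)

    def is_symlink(self):
        return stat.S_ISLNK(self.lstat().st_mode)

    def is_dir(self):
        # Assumed behaviour for this sketch: does NOT follow symlinks.
        return stat.S_ISDIR(self.lstat().st_mode)


def is_dir_following(entry):
    """Directory test that follows symlinks, built on a non-following is_dir()."""
    return (entry.is_symlink() and os.path.isdir(entry.full_name)) or entry.is_dir()
```

With this, a symlink pointing at a directory counts as a directory, while
a dangling symlink does not, because os.path.isdir() returns False when
the target is missing.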

On the other hand, if DirEntry.is_dir() does follow symlinks per your
proposal, then to do is_dir without following symlinks you need to use
DirEntry.lstat() like so:

stat.S_ISDIR(entry.lstat().st_mode)
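
For comparison, here's the same kind of sketch for this direction. Again,
FollowingEntry is a hypothetical stand-in (not the real DirEntry, and
is_dir_no_follow is a name I made up); it assumes is_dir() behaves like
os.path.isdir() and that lstat() is available:

```python
import os
import stat


class FollowingEntry:
    """Hypothetical stand-in for a DirEntry whose is_dir() follows symlinks."""

    def __init__(self, dirpath, name):
        self.name = name
        self.full_name = os.path.join(dirpath, name)

    def lstat(self):
        # lstat() describes the link itself, never its target.
        return os.lstat(self.full_name)

    def is_dir(self):
        # Assumed behaviour for this sketch: follows symlinks,
        # like os.path.isdir().
        return os.path.isdir(self.full_name)


def is_dir_no_follow(entry):
    """Directory test that does not follow symlinks, via lstat() and S_ISDIR."""
    return stat.S_ISDIR(entry.lstat().st_mode)
```

Here the caller has to import the stat module, which is the extra step
that makes this direction slightly less pleasant.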

So from this perspective it's somewhat nicer to have DirEntry.is_X()
not follow links and use DirEntry.is_symlink() and os.path.isX() to
supplement that if you want to follow links.

I think Victor has a good point that 92% of the stdlib calls that use
listdir and isX do follow links.

However, I think Tim Delaney makes some good points above about the
(not so) safety of scandir following symlinks by default -- symlinks
to network file systems, nonexistent files, or huge directory trees. In
that light, this kind of thing should be opt-*in*.
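
As an illustration of what opt-in could look like for a caller, here's a
small sketch of a directory walker built only on os.listdir() and
os.lstat(). The function name iter_dirs and its follow_links parameter
are invented for this example and are not part of the PEP:

```python
import os
import stat


def iter_dirs(top, follow_links=False):
    """Yield every directory below top.

    Symlinked directories are skipped unless the caller opts in with
    follow_links=True. This sketch does not guard against symlink loops.
    """
    for name in sorted(os.listdir(top)):
        path = os.path.join(top, name)
        mode = os.lstat(path).st_mode  # never follows the link
        if stat.S_ISDIR(mode):
            yield path
            yield from iter_dirs(path, follow_links)
        elif stat.S_ISLNK(mode) and follow_links and os.path.isdir(path):
            # Opt-in branch: only here can a link lead to a network
            # file system or a huge tree. A dangling link is skipped
            # because os.path.isdir() returns False for it.
            yield path
            yield from iter_dirs(path, follow_links)
```

By default the walk never leaves the real tree; the caller has to ask
explicitly before anything behind a symlink is touched.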

I guess I'm still slightly on the DirEntry-does-not-follow-links side
of the fence, due to the fact that it's a method on the *directory
entry* object, due to simplicity of implementation, and due to Tim
Delaney's "it should be safe by default" point above.

However, we're *almost* bikeshedding at this point, and I think we
just need to pick one way or the other. It's straightforward to
implement one in terms of the other in each case.

-Ben

