[Python-Dev] Issue 11406: adding os.scandir(), a directory iterator returning stat-like info

Christian Heimes christian at python.org
Mon May 13 01:28:33 CEST 2013


On 13.05.2013 00:04, Ben Hoyt wrote:
> In fact, I don't think .cached_lstat should be exposed to the user.
> They just call entry.lstat(), and it returns a cached stat or calls
> os.lstat() to get the real stat if required (and populates the
> internal cached stat value). And the entry.is* functions would call
> entry.lstat() if dirent was None or d_type was DT_UNKNOWN. This would
> change relatively nasty code like this:

I would prefer to go the other route and not expose lstat(). It's
cleaner and less confusing to have a cached_lstat property on the object
because the name says what it actually contains. The property's internal
code can do an lstat() call if necessary.

Your code example doesn't handle the case of a failing lstat() call. It
can happen when the file is removed or the permissions of a parent
directory change.
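
A rough sketch of the property idea, to make the failure case concrete.
The class below is a cut-down, hypothetical DirEntry (not the proposed
API): cached_lstat returns the cached value if there is one and
otherwise calls os.lstat(). I assume here that an OSError from that call
is simply left to propagate, so the caller has to be ready for it.

```python
import os
import tempfile


class DirEntry:
    """Cut-down, hypothetical entry object; only the caching is shown."""

    def __init__(self, name, lstat=None, path="."):
        self.name = name
        self._lstat = lstat  # stat_result if the platform supplied one
        self._path = path

    @property
    def cached_lstat(self):
        # Fall back to a real lstat() call only when nothing is cached.
        if self._lstat is None:
            self._lstat = os.lstat(os.path.join(self._path, self.name))
        return self._lstat


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "gone.txt")
    open(target, "w").close()
    entry = DirEntry("gone.txt", path=tmp)
    os.remove(target)  # the file disappears after it was listed
    try:
        entry.cached_lstat
    except OSError as exc:
        print("lstat failed:", type(exc).__name__)
```

Once a value has been cached, later accesses never touch the file system
again, so the result may be stale by the time it is used.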

> This change would make scandir() usable by ordinary mortals, rather
> than just hardcore library implementors.

Why not have both? The os module exposes and leaks platform details
on more than one occasion. A low-level function can expose name + dirent
struct on POSIX and name + stat_result on Windows. Then you can build a
high-level API like os.scandir() in pure Python code.
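
To illustrate the layering, here is a minimal sketch. All names in it
(RawEntry, raw_scandir, Entry, scandir_sketch) are made up for the
example. The low-level function is only emulated with os.listdir() and
always reports "no dirent, no stat_result"; a real one would fill in
whichever of the two the platform provides.

```python
import os
import stat
import tempfile
from collections import namedtuple

# Hypothetical low-level record: `dirent` would be set on POSIX,
# `stat_result` on Windows. Both are None in this emulation.
RawEntry = namedtuple("RawEntry", ["name", "dirent", "stat_result"])


def raw_scandir(path):
    """Stand-in for the platform-specific primitive."""
    for name in os.listdir(path):
        yield RawEntry(name, None, None)


class Entry:
    """High-level wrapper that hides which detail was available."""

    def __init__(self, raw, path):
        self.name = raw.name
        self._raw = raw
        self._path = path

    def isdir(self):
        st = self._raw.stat_result
        if st is None:
            # Nothing came for free, so pay for an lstat() call.
            st = os.lstat(os.path.join(self._path, self.name))
        return stat.S_ISDIR(st.st_mode)


def scandir_sketch(path="."):
    """Pure-Python high-level API built on the low-level generator."""
    for raw in raw_scandir(path):
        yield Entry(raw, path)


with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, "sub"))
    open(os.path.join(tmp, "file.txt"), "w").close()
    for entry in sorted(scandir_sketch(tmp), key=lambda e: e.name):
        print(entry.name, entry.isdir())
```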

> class DirEntry:
>     def __init__(self, name, dirent, lstat, path='.'):
>         # User shouldn't need to call this, but called internally by scandir()
>         self.name = name
>         self.dirent = dirent
>         self._lstat = lstat  # non-public attributes
>         self._path = path

You should include the fd of the DIR pointer here for the new *at()
function family.

>     def lstat(self):
>         if self._lstat is None:
>             self._lstat = os.lstat(os.path.join(self._path, self.name))
>         return self._lstat

The function should use fstatat(2) (os.lstat with dir_fd) when it is
available on the current platform. It's better and more secure than
lstat() with a joined path.
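
Roughly like this. It is a sketch only: the dir_fd argument to
DirEntry is my assumption about how the directory's fd would be handed
in. It uses os.stat(name, dir_fd=..., follow_symlinks=False), which the
docs describe as equivalent to os.lstat() with dir_fd, and checks
os.supports_dir_fd before relying on it.

```python
import os
import stat
import tempfile


class DirEntry:
    def __init__(self, name, path=".", dir_fd=None):
        self.name = name
        self._path = path
        self._dir_fd = dir_fd  # fd of the directory being scanned, if any
        self._lstat = None

    def lstat(self):
        if self._lstat is None:
            if self._dir_fd is not None and os.stat in os.supports_dir_fd:
                # fstatat(2) relative to the open directory: unaffected
                # by the parent path being renamed or replaced.
                self._lstat = os.stat(
                    self.name, dir_fd=self._dir_fd, follow_symlinks=False
                )
            else:
                # Fallback for platforms without the *at() family.
                self._lstat = os.lstat(os.path.join(self._path, self.name))
        return self._lstat


with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "a.txt"), "w").close()
    fd = os.open(tmp, os.O_RDONLY) if os.stat in os.supports_dir_fd else None
    try:
        entry = DirEntry("a.txt", path=tmp, dir_fd=fd)
        print(stat.S_ISREG(entry.lstat().st_mode))
    finally:
        if fd is not None:
            os.close(fd)
```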

>     def isdir(self):
>         if self.dirent is not None and self.dirent.d_type != DT_UNKNOWN:
>             return self.dirent.d_type == DT_DIR
>         else:
>             return stat.S_ISDIR(self.lstat().st_mode)
> 
>     def isfile(self):
>         if self.dirent is not None and self.dirent.d_type != DT_UNKNOWN:
>             return self.dirent.d_type == DT_REG
>         else:
>             return stat.S_ISREG(self.lstat().st_mode)
> 
>     def islink(self):
>         if self.dirent is not None and self.dirent.d_type != DT_UNKNOWN:
>             return self.dirent.d_type == DT_LNK
>         else:
>             return stat.S_ISLNK(self.lstat().st_mode)

A bit faster:

    d_type = getattr(self.dirent, "d_type", DT_UNKNOWN)
    if d_type != DT_UNKNOWN:
        return d_type == DT_LNK

The code doesn't handle a failing lstat() call.
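
Putting the two remarks together, a sketch of islink() might look like
the following. The DT_* values are the ones from Linux <dirent.h>, and
Dirent and Entry are hypothetical stand-ins. Returning False when
lstat() fails is my assumption; it mirrors what os.path.islink() does,
but raising would be a defensible choice too.

```python
import os
import stat
from collections import namedtuple

# d_type constants as defined in Linux <dirent.h>.
DT_UNKNOWN = 0
DT_DIR = 4
DT_REG = 8
DT_LNK = 10

# Hypothetical stand-in for the dirent struct.
Dirent = namedtuple("Dirent", ["d_type"])


class Entry:
    def __init__(self, name, dirent=None, path="."):
        self.name = name
        self.dirent = dirent
        self._path = path

    def islink(self):
        # getattr() also covers dirent being None, in one lookup.
        d_type = getattr(self.dirent, "d_type", DT_UNKNOWN)
        if d_type != DT_UNKNOWN:
            return d_type == DT_LNK
        try:
            st = os.lstat(os.path.join(self._path, self.name))
        except OSError:
            # Entry vanished or a parent became unreadable.
            return False
        return stat.S_ISLNK(st.st_mode)


print(Entry("a", Dirent(DT_LNK)).islink())
print(Entry("no-such-entry", None, path="/no/such/dir").islink())
```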

Christian


