[Python-Dev] Issue 11406: adding os.scandir(), a directory iterator returning stat-like info
Gregory P. Smith
greg at krypto.org
Tue May 14 08:39:21 CEST 2013
On Sun, May 12, 2013 at 3:04 PM, Ben Hoyt <benhoyt at gmail.com> wrote:
> > And if we're creating a custom object instead, why return a 2-tuple
> > rather than making the entry's name an attribute of the custom object?
> >
> > To me, that suggests a more reasonable API for os.scandir() might be
> > for it to be an iterator over "dir_entry" objects:
> >
> > name (as a string)
> > is_file()
> > is_dir()
> > is_link()
> > stat()
> > cached_stat (None or a stat object)
>
> Nice! I really like your basic idea of returning a custom object
> instead of a 2-tuple. And I agree with Christian that .stat() would be
> clearer called .lstat(). I also like your later idea of simply
> exposing .dirent (would be None on Windows).
>
> One tweak I'd suggest is that is_file() etc be called isfile() etc
> without the underscore, to match the naming of the os.path.is*
> functions.
>
> > That would actually make sense at an implementation
> > level anyway - is_file() etc would check self.cached_lstat first, and
> > if that was None they would check self.dirent, and if that was also
> > None they would raise an error.
>
> Hmm, I'm not sure about this at all. Are you suggesting that the
> DirEntry object's is* functions would raise an error if both
> cached_lstat and dirent were None? Wouldn't it make for a much simpler
> API to just call os.lstat() and populate cached_lstat instead? As far
> as I'm concerned, that'd be the point of making DirEntry.lstat() a
> function.
>
> In fact, I don't think .cached_lstat should be exposed to the user.
> They just call entry.lstat(), and it returns a cached stat or calls
> os.lstat() to get the real stat if required (and populates the
> internal cached stat value). And the entry.is* functions would call
> entry.lstat() if dirent was None or d_type was DT_UNKNOWN. This would
> change relatively nasty code like this:
>
>     files = []
>     dirs = []
>     for entry in os.scandir(path):
>         try:
>             isdir = entry.isdir()
>         except NotPresentError:
>             st = os.lstat(os.path.join(path, entry.name))
>             isdir = stat.S_ISDIR(st.st_mode)
>         if isdir:
>             dirs.append(entry.name)
>         else:
>             files.append(entry.name)
>
> Into nice clean code like this:
>
>     files = []
>     dirs = []
>     for entry in os.scandir(path):
>         if entry.isdir():
>             dirs.append(entry.name)
>         else:
>             files.append(entry.name)
>
> This change would make scandir() usable by ordinary mortals, rather
> than just hardcore library implementors.
>
> In other words, I'm proposing that the DirEntry objects yielded by
> scandir() would have .name and .dirent attributes, and .isdir(),
> .isfile(), .islink(), .lstat() methods, and look basically like this
> (though presumably implemented in C):
>
>     class DirEntry:
>         def __init__(self, name, dirent, lstat, path='.'):
>             # User shouldn't need to call this, but called internally by scandir()
>             self.name = name
>             self.dirent = dirent
>             self._lstat = lstat  # non-public attributes
>             self._path = path
>
>         def lstat(self):
>             if self._lstat is None:
>                 self._lstat = os.lstat(os.path.join(self._path, self.name))
>             return self._lstat
>
>         def isdir(self):
>             if self.dirent is not None and self.dirent.d_type != DT_UNKNOWN:
>                 return self.dirent.d_type == DT_DIR
>             else:
>                 return stat.S_ISDIR(self.lstat().st_mode)
>
>         def isfile(self):
>             if self.dirent is not None and self.dirent.d_type != DT_UNKNOWN:
>                 return self.dirent.d_type == DT_REG
>             else:
>                 return stat.S_ISREG(self.lstat().st_mode)
>
>         def islink(self):
>             if self.dirent is not None and self.dirent.d_type != DT_UNKNOWN:
>                 return self.dirent.d_type == DT_LNK
>             else:
>                 return stat.S_ISLNK(self.lstat().st_mode)
>
> Oh, and the .dirent would either be None (Windows) or would have
> .d_type and .d_ino attributes (Linux, OS X).
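
As a rough illustration of the .dirent attribute described above, here is a minimal Python sketch. The Dirent namedtuple and the type_from_dirent() helper are hypothetical names made up for this example, not part of the proposal. The DT_* values are the ones from Linux's <dirent.h> and are assumed here, since other platforms may define them differently.

```python
import collections

# d_type constants as defined in Linux's <dirent.h> (assumed for this sketch).
DT_UNKNOWN = 0
DT_DIR = 4
DT_REG = 8
DT_LNK = 10

# Hypothetical stand-in for the proposed .dirent attribute.
Dirent = collections.namedtuple("Dirent", ["d_ino", "d_type"])


def type_from_dirent(dirent):
    """Return 'dir', 'file', 'link' or 'other' when the dirent knows the type.

    Return None when the caller has to fall back to os.lstat(): either there
    is no dirent at all (Windows) or the file system reported DT_UNKNOWN.
    """
    if dirent is None or dirent.d_type == DT_UNKNOWN:
        return None
    known = {DT_DIR: "dir", DT_REG: "file", DT_LNK: "link"}
    return known.get(dirent.d_type, "other")
```

The None return value marks exactly the two cases in which the is* methods above would call lstat().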
>
> This would make the scandir() API nice and simple to use for callers,
> but still expose all the information the OS provides (both the
> meaningful fields in dirent, and a full stat on Windows, nicely cached
> in the DirEntry object).
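
To make the caching behaviour concrete, here is a small runnable sketch in pure Python. It is only an approximation under stated assumptions: scandir_sketch() is a hypothetical stand-in built on os.listdir(), dirent is always None (the Windows case in the proposal), and the lstat_calls counter exists purely to show that os.lstat() runs at most once per entry.

```python
import os
import stat
import tempfile


class DirEntry:
    """Sketch of the proposed entry object; dirent is always None here."""

    def __init__(self, name, lstat=None, path="."):
        self.name = name
        self.dirent = None      # no dirent information in this sketch
        self._lstat = lstat     # cached stat result, filled lazily
        self._path = path
        self.lstat_calls = 0    # hypothetical counter, for demonstration only

    def lstat(self):
        # Call os.lstat() only on first use, then reuse the cached result.
        if self._lstat is None:
            self.lstat_calls += 1
            self._lstat = os.lstat(os.path.join(self._path, self.name))
        return self._lstat

    def isdir(self):
        return stat.S_ISDIR(self.lstat().st_mode)

    def isfile(self):
        return stat.S_ISREG(self.lstat().st_mode)

    def islink(self):
        return stat.S_ISLNK(self.lstat().st_mode)


def scandir_sketch(path="."):
    """Yield DirEntry objects; a stand-in built on os.listdir()."""
    for name in os.listdir(path):
        yield DirEntry(name, path=path)


def split_entries(path):
    """Return (dirs, files) name lists using the simple loop from the thread."""
    dirs, files = [], []
    for entry in scandir_sketch(path):
        if entry.isdir():
            dirs.append(entry.name)
        else:
            files.append(entry.name)
    return dirs, files


def demo():
    """Build a temporary tree and return (dirs, files, lstat call count)."""
    with tempfile.TemporaryDirectory() as d:
        os.mkdir(os.path.join(d, "sub"))
        with open(os.path.join(d, "a.txt"), "w"):
            pass
        dirs, files = split_entries(d)
        entry = DirEntry("a.txt", path=d)
        entry.isfile()
        entry.isdir()
        entry.islink()
        return dirs, files, entry.lstat_calls
```

Calling all three is* methods on one entry in demo() triggers a single os.lstat() call, which is the point of keeping the cached stat private behind entry.lstat().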
>
> Thoughts?
>
I like the sound of this (which sounds like what you've implemented now
though I haven't looked at your code).
-gps
>
> -Ben