[Python-Dev] Issue 11406: adding os.scandir(), a directory iterator returning stat-like info

Nick Coghlan ncoghlan at gmail.com
Sat May 11 16:34:14 CEST 2013


On Sat, May 11, 2013 at 2:24 PM, Ben Hoyt <benhoyt at gmail.com> wrote:
> In all the *practical* examples I've seen (and written myself), I
> iterate over a directory and I just need to know whether it's a file
> or directory (or maybe a link). Occasionally you need the size as
> well, but that would just mean a similar check "if st.st_size is None:
> st = os.stat(...)", which on Linux/OS X would call stat(), but it'd
> still be free and fast on Windows.
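
For concreteness, the check Ben describes might look roughly like the sketch below. The helper name size_of and the use of SimpleNamespace as a stand-in for a partially filled stat-like object are assumptions made purely for illustration; neither comes from the proposal.

```python
import os
from types import SimpleNamespace


def size_of(path, st):
    """Return the file size, calling os.stat() only if st lacks it.

    `st` is a hypothetical stat-like object whose fields may be None
    when the directory listing did not supply them.
    """
    if st.st_size is None:
        # Field was not provided by the directory scan: fall back to stat().
        st = os.stat(path)
    return st.st_size


# Stand-in for an entry where the platform gave us no size information.
partial = SimpleNamespace(st_mode=None, st_size=None)
print(size_of(os.curdir, partial))
```

When st_size is already populated, the helper never touches the filesystem, which is the "free and fast" case.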

Here's the full set of fields on a current stat object:

st_atime
st_atime_ns
st_blksize
st_blocks
st_ctime
st_ctime_ns
st_dev
st_gid
st_ino
st_mode
st_mtime
st_mtime_ns
st_nlink
st_rdev
st_size
st_uid
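
A quick way to reproduce that list on a given machine is sketched below. The exact set of st_* attributes varies by platform and Python version, so the output may not match the list above exactly.

```python
import os

# Collect the st_* attribute names exposed by a stat result on this platform.
fields = sorted(name for name in dir(os.stat(os.curdir)) if name.startswith("st_"))
print("\n".join(fields))
```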

Do we really want to publish an object with all of those as attributes
potentially set to None, when the abstraction we're trying to present
is intended primarily for the benefit of os.walk?

And if we're creating a custom object instead, why return a 2-tuple
rather than making the entry's name an attribute of the custom object?

To me, that suggests a more reasonable API for os.scandir() might be
for it to be an iterator over "dir_entry" objects:

    name (as a string)
    is_file()
    is_dir()
    is_link()
    stat()
    cached_stat (None or a stat object)

On all platforms, the query methods would not require a separate
stat() call. On Windows, cached_stat would be populated with a full
stat object when scandir builds the entry. On non-Windows platforms,
cached_stat would initially be None, and you would have to call stat()
to populate it.
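
As a rough illustration of the shape of that API, here is a pure-Python sketch. The names DirEntry and scandir_sketch are hypothetical, and so is the choice of os.lstat() (so that links are reported as links). The emulation also has to call lstat() to learn each entry's type, which is exactly the cost a real implementation would avoid by taking that information from the OS directory-reading call. cached_stat is left as None at construction, mirroring the non-Windows behaviour described above.

```python
import os
import stat as stat_module


class DirEntry:
    """Hypothetical stand-in for the proposed dir_entry object."""

    def __init__(self, dirpath, name, type_bits, cached_stat=None):
        self.name = name
        self.cached_stat = cached_stat  # None until stat() is called
        self._path = os.path.join(dirpath, name)
        self._type_bits = type_bits     # file-type portion of st_mode

    def is_file(self):
        return stat_module.S_ISREG(self._type_bits)

    def is_dir(self):
        return stat_module.S_ISDIR(self._type_bits)

    def is_link(self):
        return stat_module.S_ISLNK(self._type_bits)

    def stat(self):
        # Populate the cache on first use, then reuse it.
        if self.cached_stat is None:
            self.cached_stat = os.lstat(self._path)
        return self.cached_stat


def scandir_sketch(dirpath=os.curdir):
    """Yield DirEntry objects for each name in dirpath (emulation only)."""
    for name in os.listdir(dirpath):
        # Emulation: a real implementation would get the type from the
        # directory read itself rather than from a per-entry lstat().
        st = os.lstat(os.path.join(dirpath, name))
        yield DirEntry(dirpath, name, stat_module.S_IFMT(st.st_mode))


if __name__ == "__main__":
    for entry in scandir_sketch():
        kind = "dir" if entry.is_dir() else "link" if entry.is_link() else "file"
        print(kind, entry.name)
```

A caller that only needs the entry type never triggers stat(), while one that needs the size calls entry.stat().st_size and pays for the extra system call only then.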

If we find other details that we can reliably provide cross-platform
from the dir information, then we can add more query methods or
attributes to the dir_entry object.

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

