[Python-ideas] Speed up os.walk() 5x to 9x by using file attributes from FindFirst/NextFile() and readdir()

Mike Meyer mwm at mired.org
Fri Nov 16 00:47:06 CET 2012

On Nov 15, 2012 2:06 PM, "Ben Hoyt" <benhoyt at gmail.com> wrote:
> >> """Yield tuples of (filename, stat_result) for each filename in
> >> directory given by "path". Like listdir(), '.' and '..' are skipped.
> >> The values are yielded in system-dependent order.
> >>
> >> Each stat_result is an object like you'd get by calling os.stat() on
> >> that file, but not all information is present on all systems, and st_*
> >> fields that are not available will be None.
> >>
> >> In practice, stat_result is a full os.stat() on Windows, but only the
> >> "is type" bits of the st_mode field are available on Linux/OS X/BSD.
> >> """
> >
> > There's a code smell here, in that the doc for Unix variants is
> > incomplete and wrong. Whether or not you get the d_type values depends
> > on the OS having that extension. Further, there's a d_type value
> > (DT_UNKNOWN) that isn't a valid value for the S_IFMT bits in st_mode
> > (at least on BSD).
> Not sure I understand why the docstring above is incomplete/wrong.

It's incomplete because it doesn't say what happens on other Posix
systems. It's wrong because it implies that the type bits of st_mode are
always available, when that's not the case.

A better wording would be 'on Posix systems, if st_mode is not None, only
the type bits are valid.' That assumes the underlying code translates
DT_UNKNOWN into binding st_mode to None.
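
A minimal sketch of that translation, to make the proposed wording concrete.
This is not the actual implementation under discussion: the helper names
st_mode_from_d_type and is_dir are hypothetical, and the DT_* numbers are
assumed from the Linux and BSD <dirent.h> headers, since Python's os module
does not expose them.

```python
import os
import stat

# Assumed d_type values, as defined in <dirent.h> on Linux and the BSDs.
DT_UNKNOWN = 0
DT_TO_IFMT = {
    1: stat.S_IFIFO,    # DT_FIFO
    2: stat.S_IFCHR,    # DT_CHR
    4: stat.S_IFDIR,    # DT_DIR
    6: stat.S_IFBLK,    # DT_BLK
    8: stat.S_IFREG,    # DT_REG
    10: stat.S_IFLNK,   # DT_LNK
    12: stat.S_IFSOCK,  # DT_SOCK
}


def st_mode_from_d_type(d_type):
    """Hypothetical helper: return only the type bits, or None.

    DT_UNKNOWN (and any unrecognized value) has no S_IFMT equivalent,
    so it maps to None rather than to a bogus st_mode.
    """
    return DT_TO_IFMT.get(d_type)


def is_dir(path, st_mode):
    """Hypothetical caller: fall back to a real lstat() when st_mode is None."""
    if st_mode is None:
        st_mode = os.lstat(path).st_mode
    return stat.S_ISDIR(st_mode)
```

With this shape, a caller never has to know about DT_UNKNOWN: it checks
st_mode for None and pays for a stat call only in that case.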