[Python-Dev] os.path.walk() lacks 'depth first' option

Guido van Rossum guido@python.org
Mon, 21 Apr 2003 15:30:29 -0400

> Here's that again, with the bug repaired, sped up some, and with a
> docstring.  Double duty: the example in the docstring shows why we
> don't want to make a special case out of sum([]): empty lists can
> arise naturally.
> What else would people like in this?  I really like separating the
> directory names from the plain-file names, so don't bother griping
> about that <wink>.

Good enough for me. :-)
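
For illustration, here is a minimal sketch of the usage the quoted text
describes: directory names and plain-file names arrive in separate
lists, and a directory with no plain files naturally produces sum([]).
It assumes the generator under discussion behaves like the os.walk()
that later shipped in the standard library; the helper name
bytes_per_directory is hypothetical and not part of the attached code.

```python
import os


def bytes_per_directory(top):
    """Map each directory under `top` to the total size of its plain files.

    Hypothetical helper for illustration only.
    """
    totals = {}
    # Each step yields the directory path, its subdirectory names, and its
    # plain-file names as three separate values.
    for root, dirs, files in os.walk(top):
        # `files` is empty for a directory holding only subdirectories (or
        # nothing at all), so this relies on sum([]) being 0.
        totals[root] = sum([os.path.getsize(os.path.join(root, name))
                            for name in files])
    return totals
```

Because sum([]) is simply 0, the loop body needs no special case for
directories that hold no plain files.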

> It's at least as fast as the current os.path.walk() (it's generally
> faster for me, but times for this are extremely variable on Win98).
> Removing the internal recursion doesn't appear to make a measurable
> difference when walking my Python tree, although because recursive
> generators require time proportional to the current stack depth to
> deliver a result to the caller, and to resume again, removing
> recursion could be much more efficient on an extremely deep tree.
> The biggest speedup I could find on Windows was via using os.chdir()
> liberally, so that os.path.join() calls weren't needed, and
> os.path.isdir() calls worked directly on one-component names.  I
> suspect this is because Win98 doesn't have an effective way to
> cache directory lookups under the covers.  Even so, it only
> amounted to a 10% speedup: directory walking is plain slow on Win98
> no matter how you do it.  The attached doesn't play any gross speed
> tricks.

Please don't use chdir(), no matter how much it speeds things up.  It's
a disaster in a multi-threaded program.
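
To make the trade-off concrete: the current working directory is
per-process state, so a walker that calls chdir() changes what every
relative path means in every other thread.  Below is a rough sketch,
not the attached implementation, of a walker that keeps the explicit
stack idea (no recursive generators) while building full paths with
os.path.join() instead of changing directory.  The name walk_no_chdir
and the choice to skip unreadable directories and symlinked
directories are assumptions made for this example.

```python
import os


def walk_no_chdir(top):
    """Yield (dirpath, dirnames, filenames) without recursion or os.chdir().

    Hypothetical sketch for illustration only.
    """
    pending = [top]  # explicit stack replaces nested recursive generators
    while pending:
        dirpath = pending.pop()
        try:
            names = os.listdir(dirpath)
        except OSError:
            continue  # assumption: silently skip directories we cannot read
        dirnames, filenames = [], []
        for name in names:
            # Test the joined path rather than a bare name, so the
            # process-wide working directory is never touched.
            if os.path.isdir(os.path.join(dirpath, name)):
                dirnames.append(name)
            else:
                filenames.append(name)
        yield dirpath, dirnames, filenames
        # Push subdirectories after the yield, so a caller may prune
        # `dirnames` in place before the walk descends.
        for name in dirnames:
            full = os.path.join(dirpath, name)
            if not os.path.islink(full):  # assumption: don't follow symlinks
                pending.append(full)
```

This pays for the os.path.join() calls that the chdir() trick avoids,
but other threads can keep opening relative paths while the walk runs.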

--Guido van Rossum (home page: http://www.python.org/~guido/)