On Wed, May 1, 2013 at 5:32 PM, Pieter Nagel wrote:
> Isn't the whole notion that stat() needs to be cached for performance reasons somewhat of a historical relic of older OSes and filesystem performance? AFAIK Linux already has stat() caching as a side effect of the filesystem layer's metadata caching. How do Windows and Mac OS fare here? Are there benchmarks proving that this is serious enough to complicate the API?
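One rough way to check this on a given platform is to time code that calls os.stat() for every question it asks about a path against code that calls it once and reuses the result. The following is only a sketch under stated assumptions, not a rigorous benchmark: the helper names check_uncached, check_cached and compare are hypothetical, and it runs single-threaded against a local temporary file, so it says nothing about GIL contention or network mounts.

```python
import os
import stat
import tempfile
import timeit


def check_uncached(path):
    # Hypothetical helper: one stat() system call per question asked.
    is_file = stat.S_ISREG(os.stat(path).st_mode)
    size = os.stat(path).st_size
    return is_file, size


def check_cached(path):
    # Hypothetical helper: a single stat() call, with the result reused.
    st = os.stat(path)
    return stat.S_ISREG(st.st_mode), st.st_size


def compare(path, number=100000):
    # Returns total seconds for each variant over `number` iterations.
    uncached = timeit.timeit(lambda: check_uncached(path), number=number)
    cached = timeit.timeit(lambda: check_cached(path), number=number)
    return uncached, cached


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as f:
        uncached, cached = compare(f.name)
        print("uncached: %.3fs  cached: %.3fs" % (uncached, cached))
```

The numbers will vary with the OS, the filesystem and whether the metadata is already in the kernel's cache, so they are best read as a relative comparison on one machine.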
System calls typically release the GIL in threaded code (due to the possibility that the underlying filesystem may be a network mount), which ends up being painfully expensive.

The repeated stat calls also appear to be one of the main reasons walkdir's generator pipeline is so much slower than performing the same operations in a plain loop (see http://walkdir.readthedocs.org), although I admit it was a year or two ago that I made those comparisons, and it wasn't the most scientific of benchmarking efforts.

Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia