[Python-Dev] PEP 428: stat caching undesirable?

Nick Coghlan ncoghlan at gmail.com
Wed May 1 11:15:58 CEST 2013


On Wed, May 1, 2013 at 5:32 PM, Pieter Nagel <pieter at nagel.co.za> wrote:
> Isn't the whole notion that stat() needs to be cached for performance
> reasons somewhat of a historical relic of older OSes and filesystem
> performance? AFAIK Linux already has stat() caching as a side-effect of
> the filesystem layer's metadata caching. How do Windows and Mac OS
> fare here? Are there benchmarks proving that this is serious enough to
> complicate the API?

CPython typically releases the GIL around system calls such as stat()
(due to the possibility the underlying filesystem may be a network
mount, so the call could block), and in threaded code that release and
reacquire ends up being painfully expensive.
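
As a rough illustration of the kind of caching under discussion, here
is a minimal sketch. CachedStatPath and its stat_calls counter are
hypothetical names made up for this example, not the PEP 428 API. The
idea is simply that several queries about one path share a single
os.stat() call, at the cost of the answers going stale if the file
changes afterwards.

```python
import os
import stat
import tempfile


class CachedStatPath:
    """Hypothetical helper (not the PEP 428 API): keeps the first stat() result."""

    def __init__(self, path):
        self.path = path
        self._stat = None
        self.stat_calls = 0  # number of real os.stat() calls made

    def stat(self):
        # Only hit the OS the first time; later queries reuse the result.
        if self._stat is None:
            self.stat_calls += 1
            self._stat = os.stat(self.path)
        return self._stat

    def is_dir(self):
        return stat.S_ISDIR(self.stat().st_mode)

    def is_file(self):
        return stat.S_ISREG(self.stat().st_mode)

    def size(self):
        return self.stat().st_size


with tempfile.TemporaryDirectory() as tmp:
    p = CachedStatPath(tmp)
    # Three separate questions about the same path, one system call.
    answers = (p.is_dir(), p.is_file(), p.size() >= 0)
    calls = p.stat_calls
```

Without the cache, each of the three queries would make its own
os.stat() call.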

The repeated stat calls also appear to be one of the main reasons
walkdir's generator pipeline (see http://walkdir.readthedocs.org) is
so much slower than performing the same operations in a plain loop,
although I admit I made those comparisons a year or two ago, and it
wasn't the most scientific of benchmarking efforts.
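
To make the pipeline-versus-loop difference concrete, here is a small
sketch that only counts stat calls rather than timing anything. It does
not use walkdir's actual API: counted_stat, only_files, non_empty and
loop_filter are hypothetical names for this example. Each pipeline
stage is independent, so each one stats every path it sees, while the
loop stats each path once and reuses the result.

```python
import os
import stat
import tempfile

stat_calls = 0


def counted_stat(path):
    """Wrap os.stat() so the number of calls can be compared."""
    global stat_calls
    stat_calls += 1
    return os.stat(path)


# Pipeline style: every stage is a generator that stats each path itself.
def only_files(paths):
    for p in paths:
        if stat.S_ISREG(counted_stat(p).st_mode):
            yield p


def non_empty(paths):
    for p in paths:
        if counted_stat(p).st_size > 0:
            yield p


# Loop style: one stat per path, shared by both checks.
def loop_filter(paths):
    result = []
    for p in paths:
        st = counted_stat(p)
        if stat.S_ISREG(st.st_mode) and st.st_size > 0:
            result.append(p)
    return result


with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for i in range(3):
        name = os.path.join(tmp, "f%d.txt" % i)
        with open(name, "w") as f:
            f.write("x")
        paths.append(name)

    stat_calls = 0
    pipeline_result = list(non_empty(only_files(paths)))
    pipeline_calls = stat_calls

    stat_calls = 0
    loop_result = loop_filter(paths)
    loop_calls = stat_calls
```

Both versions select the same files, but with two stages the pipeline
makes twice as many stat calls as the loop, and the gap grows with
every stage added.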

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

