[New-bugs-announce] [issue26031] Add stat caching option to pathlib

Guido van Rossum report at bugs.python.org
Wed Jan 6 17:41:29 EST 2016


New submission from Guido van Rossum:

There are concerns that pathlib is inefficient because it doesn't cache stat() operations. For example, this code calls stat() twice for each result (once inside the glob, and a second time to answer the is_symlink() question):

  import pathlib

  p = pathlib.Path('/usr')
  links = [x for x in p.rglob('*') if x.is_symlink()]

I have a tentative patch (without tests). On my Mac it only gives modest speedups (between 5 and 20 percent) but things may be different on other platforms or for applications that make a lot of inquiries about the same path.

The API I am proposing is that by default nothing changes; to benefit from caching you must instantiate a StatCache() object and pass it to Path() constructor calls, e.g. Path('/usr', stat_cache=cache_object). All Path objects derived from this path object will share the cache. To force an uncached Path object you can use Path(p).
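
To make the caching idea concrete, here is a minimal standalone sketch. It is not the code in statcache.diff and it does not touch pathlib itself; the StatCache class below, its lstat() and is_symlink() methods, and the misses counter are all assumptions made for illustration. It only shows how repeated queries about the same path can be answered from a single lstat() call.

```python
import os
import stat
import tempfile


class StatCache:
    """Hypothetical cache of os.lstat() results, keyed by path string."""

    def __init__(self):
        self._results = {}
        self.misses = 0  # number of real os.lstat() calls made so far

    def lstat(self, path):
        key = os.fspath(path)
        try:
            return self._results[key]
        except KeyError:
            # First query for this path: hit the filesystem and remember it.
            self.misses += 1
            result = self._results[key] = os.lstat(key)
            return result

    def is_symlink(self, path):
        # Answered from the cached lstat() result when one is available.
        return stat.S_ISLNK(self.lstat(path).st_mode)


def demo():
    """Query the same regular file twice; only one lstat() call is made."""
    cache = StatCache()
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "target.txt")
        with open(target, "w") as f:
            f.write("x")
        first = cache.is_symlink(target)
        second = cache.is_symlink(target)
    return first, second, cache.misses
```

A cache like this can return stale answers if the filesystem changes after the first query, which is one reason to keep it opt-in and scoped to an explicit cache object.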

The patch is incomplete; there are no tests for the new functionality (though existing tests pass) and __eq__ should be adjusted so that Path objects using different caches always compare unequal.
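
One way to picture the __eq__ adjustment is sketched below. This is an illustration under stated assumptions, not the patch: CachedPath is a hypothetical wrapper class, and its stat_cache attribute stands in for whatever the real implementation would store. Two objects compare equal only when their paths match and they share the very same cache object (or both have none).

```python
import pathlib


class CachedPath:
    """Hypothetical wrapper pairing a pure path with an optional cache object."""

    def __init__(self, path, stat_cache=None):
        self.path = pathlib.PurePath(path)
        self.stat_cache = stat_cache

    def __eq__(self, other):
        if not isinstance(other, CachedPath):
            return NotImplemented
        # Same path AND the identical cache object (identity, not equality).
        return self.path == other.path and self.stat_cache is other.stat_cache

    def __hash__(self):
        # Keep hashing consistent with __eq__ by including the cache identity.
        return hash((self.path, id(self.stat_cache)))
```

Comparing cache objects by identity means that two separate caches are treated as different even if they happen to hold the same entries.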

Question for Antoine: Did you perhaps anticipate a design like this? Each Path instance has an _accessor slot, but there is only one accessor instance defined that is used everywhere (the global _normal_accessor). So you could have avoided a bunch of complexity in the code around setting the proper _accessor unless you were planning to use multiple accessors.

----------
files: statcache.diff
keywords: patch
messages: 257651
nosy: gvanrossum, pitrou
priority: normal
severity: normal
stage: test needed
status: open
title: Add stat caching option to pathlib
type: enhancement
versions: Python 3.6
Added file: http://bugs.python.org/file41521/statcache.diff

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue26031>
_______________________________________

