[Python-ideas] find-like functionality in pathlib

Guido van Rossum guido at python.org
Wed Jan 6 17:42:35 EST 2016


I couldn't help myself and coded up a prototype for the StatCache design I
sketched. See http://bugs.python.org/issue26031. Feedback welcome! On my
Mac it only seems to offer limited benefits though...

On Wed, Jan 6, 2016 at 8:48 AM, Guido van Rossum <guido at python.org> wrote:

> On Wed, Jan 6, 2016 at 8:11 AM, Random832 <random832 at fastmail.com> wrote:
>
>> On Tue, Jan 5, 2016, at 16:04, Guido van Rossum wrote:
>> > One problem with stat() caching is that Path objects are considered
>> > immutable, and two Path objects referring to the same path are
>> > completely interchangeable. For example, {pathlib.Path('/a'),
>> > pathlib.Path('/a')} is a set of length 1: {PosixPath('/a')}. But if we
>> > had e.g. Path('/a', cache_stat=True), the behavior of two instances of
>> > that object might be observably different (if they were instantiated at
>> > times when the contents of the filesystem were different). So maybe
>> > stat-caching Path instances should be considered unequal, or perhaps
>> > unhashable. Or perhaps they should only be considered equal if their
>> > stat() values are actually equal (i.e. if the file's stat() info didn't
>> > change).
>>
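
To make the interchangeability point above concrete, here is a minimal sketch of the current behavior. It uses PurePosixPath rather than Path so that it runs the same on any platform (an assumption made for portability; the equality and hashing rules are the ones Path inherits).

```python
import pathlib

# Two separately constructed path objects that name the same path.
a = pathlib.PurePosixPath('/a')
b = pathlib.PurePosixPath('/a')

# They are distinct objects, but compare and hash as equal...
same_object = a is b
equal = (a == b)
same_hash = (hash(a) == hash(b))

# ...so a set built from both collapses to a single element.
paths = {a, b}
print(same_object, equal, same_hash, len(paths))
```

Any per-instance stat cache would have to coexist with this collapsing behavior, which is the problem the quoted paragraph describes.
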
>> What about a global cache?
>
>
> It would have to use a weak dict so if the last reference goes away it
> discards the cached stats for a given path, otherwise you'd have trouble
> containing the cache size.
>
> And caching Path objects should still not be comparable to non-caching
> Path objects (which we will still need, to preserve the semantics that
> repeatedly calling stat() on a Path object created the default way will
> always redo the syscall). The main advantage would be that caching Path
> objects could be compared safely.
>
> It could still cause unexpected results. E.g. if you have just traversed
> some big tree using caching, and saved some results (so hanging on to some
> paths and hence their stat() results), and then you make some changes and
> traverse it again to look for something else, you might accidentally be
> seeing stale (i.e. cached) stat() results.
>
> Maybe there's a middle ground, where the user can create a StatCache
> object and pass it into Path creation and traversal operations. Paths with
> the same StatCache object (or both None) compare equal if their path
> components are equal. Paths with different StatCache objects never compare
> equal (but otherwise are ordered by path as usual -- the StatCache object's
> identity is only used when the paths are equal).
>
> Are you (or anyone still reading this) interested in implementing this
> idea?
>
> --
> --Guido van Rossum (python.org/~guido)
>
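
The comparison rules proposed in the "middle ground" paragraph above could look roughly like the sketch below. This is only an illustration under stated assumptions, not the prototype attached to the issue: the CachedPath wrapper, its attributes, and the StatCache.stat() method are hypothetical names, and the cache is a plain dict keyed by path string.

```python
import functools
import os
import pathlib


class StatCache:
    """Hypothetical cache mapping a path string to its os.stat() result."""

    def __init__(self):
        self._results = {}

    def stat(self, path):
        key = os.fspath(path)
        if key not in self._results:
            # Only the first lookup for a given path hits the filesystem.
            self._results[key] = os.stat(key)
        return self._results[key]


@functools.total_ordering
class CachedPath:
    """Hypothetical wrapper pairing a path with an optional StatCache."""

    def __init__(self, path, cache=None):
        self.path = pathlib.PurePath(path)
        self.cache = cache

    def _key(self):
        # Order by path first; the cache's identity only breaks ties.
        return (self.path, id(self.cache))

    def __eq__(self, other):
        if not isinstance(other, CachedPath):
            return NotImplemented
        # Equal only with the same path AND the same cache (or both None).
        return self.path == other.path and self.cache is other.cache

    def __lt__(self, other):
        if not isinstance(other, CachedPath):
            return NotImplemented
        return self._key() < other._key()

    def __hash__(self):
        return hash(self._key())

    def stat(self):
        if self.cache is None:
            # Default behavior: every call redoes the syscall.
            return os.stat(self.path)
        return self.cache.stat(self.path)


cache = StatCache()
both_none_equal = CachedPath('/a') == CachedPath('/a')
same_cache_equal = CachedPath('/a', cache) == CachedPath('/a', cache)
mixed_unequal = CachedPath('/a', cache) != CachedPath('/a')
other_cache_unequal = CachedPath('/a', cache) != CachedPath('/a', StatCache())
ordered_by_path = CachedPath('/a', cache) < CachedPath('/b')
```

With this shape, a set can safely mix paths from different caches: entries that share a path but not a cache stay distinct, while sorting still groups them by path.
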



-- 
--Guido van Rossum (python.org/~guido)