[Python-Dev] Remaining decisions on PEP 471 -- os.scandir()

Steve Dower Steve.Dower at microsoft.com
Mon Jul 21 18:11:45 CEST 2014


Victor Stinner wrote:
> 2014-07-20 18:50 GMT+02:00 Antoine Pitrou <antoine at python.org>:
>> Have you tried modifying importlib's _bootstrap.py to use scandir() 
>> instead of listdir() + stat()?
>
> IMO the current os.scandir() API does not fit importlib requirements.
> importlib usually wants fresh data, whereas DirEntry cache cannot be
> invalidated. It's probably possible to cache some os.stat() result in
> importlib, but it looks like it requires a non-trivial refactoring of
> the code. I don't know importlib enough to suggest how to change it.

The data is completely fresh at the time it is obtained, exactly as it would be from stat(). There will always be a race condition between looking and doing, which is why we still use exception handling on actions.
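
As a rough illustration of that look-then-act race, here is a minimal sketch. The helper name remove_if_present and the temporary file are made up for the example, and the "other process" is simulated by deleting the file in between:

```python
import os
import tempfile


def remove_if_present(path):
    """Attempt the action and handle failure instead of checking first."""
    try:
        os.remove(path)
    except FileNotFoundError:
        # The file vanished between looking and doing.
        return False
    return True


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "example.txt")
    with open(target, "w") as f:
        f.write("data")

    # "Looking": the listing is fresh at the moment it is taken.
    names = os.listdir(tmp)

    # Simulate another process deleting the file before we act on it.
    os.remove(target)

    # "Doing": the listing is now out of date, so the action must cope.
    results = [remove_if_present(os.path.join(tmp, n)) for n in names]
```

However the listing was obtained, listdir() + stat() or scandir(), the action still needs its own error handling.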

> By the way, DirEntry constructor is not documented in the PEP. Should
> we document it? It might be a way to "invalidate the cache":
>
> entry = DirEntry(os.path.dirname(entry.path), entry.name)
>
> Maybe it is an abuse of the API. A clear_cache() method would be less
> ugly :-) But maybe Ben Hoyt does not want to promote keeping DirEntry
> for a long time?

DirEntry is a convenient way to return a tuple without returning a tuple, that's all. If you want up-to-date info, call os.stat() and pass in the path. os.scandir() should just be a better (and ideally transparent) substitute for os.listdir() in every single context.
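
A small sketch of both points, assuming the os.scandir() API as it later shipped in the standard library (the with-statement form needs Python 3.6 or newer). The file and directory names are invented for the example:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "a.txt")
    with open(path, "w") as f:
        f.write("x")
    os.mkdir(os.path.join(tmp, "sub"))

    # scandir() yields the same names as listdir(), plus type information.
    listed = sorted(os.listdir(tmp))
    with os.scandir(tmp) as it:
        entries = sorted(it, key=lambda e: e.name)
    scanned = [e.name for e in entries]
    kinds = {e.name: e.is_dir() for e in entries}

    file_entry = entries[0]
    # DirEntry.stat() caches its result on the entry object.
    cached_size = file_entry.stat().st_size

    # Change the file after the entry was obtained.
    with open(path, "a") as f:
        f.write("yz")

    # For up-to-date info, call os.stat() with the path.
    fresh_size = os.stat(file_entry.path).st_size
```

Here cached_size reflects what was known when the entry was first queried, while fresh_size comes from a new os.stat() call made after the file changed.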

Personally I'd make it a string subclass and put one-shot properties on it (i.e. call stat() and cache the result on first access where we don't already know the answer), which I think is close enough to where it's landed that I'm happy. (As far as bikeshedding goes, I prefer "_DirEntry" and no docs :) )
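
For illustration only, the string-subclass idea could look roughly like the sketch below. LazyEntry and all of its attributes are hypothetical names; this is not the PEP 471 DirEntry and not an API anyone has proposed in this form:

```python
import os
import stat as stat_module
import tempfile


class LazyEntry(str):
    """Hypothetical str subclass: compares as the file name, stats once."""

    def __new__(cls, name, directory):
        self = super().__new__(cls, name)
        self.directory = directory
        self._stat = None  # filled in on first access
        return self

    @property
    def path(self):
        return os.path.join(self.directory, str(self))

    def stat(self):
        # One-shot: call os.stat() the first time, then reuse the result.
        if self._stat is None:
            self._stat = os.stat(self.path)
        return self._stat

    @property
    def is_dir(self):
        return stat_module.S_ISDIR(self.stat().st_mode)


with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, "pkg"))
    lazy_entries = [LazyEntry(n, tmp) for n in os.listdir(tmp)]

    # Because each entry is a str, the list compares equal to listdir().
    names_equal = lazy_entries == os.listdir(tmp)

    first = lazy_entries[0]
    first_is_dir = first.is_dir
    # The second stat() call returns the cached object.
    same_object = first.stat() is first.stat()
```

Code written against os.listdir() would keep working on such objects unchanged, which is the "transparent substitute" property; the cost is that the cached stat result can go stale, as discussed above.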

Cheers,
Steve

