[Python-Dev] The future of the wchar_t cache

Victor Stinner vstinner at redhat.com
Mon Oct 22 09:13:12 EDT 2018

On Mon, Oct 22, 2018 at 15:08, Steve Dower <steve.dower at python.org> wrote:
> Agreed the cache is useless here, but since the listdir() result came in
> as wchar_t we could keep it that way (assuming we'd only be changing it
> to char), and then there wouldn't have to be a conversion when we
> immediately pass it back to open().

Serhiy wants to remove the cache, which should *reduce* Python's memory
footprint on Windows.

You are proposing to fill the cache eagerly, which would increase the
Python memory footprint :-/ Your proposed change is an optimisation, so a
benchmark is needed to show the benefit. I expect no significant
difference on the benchmarks of https://pyperformance.readthedocs.io/ ...
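
A micro-benchmark along the following lines could be run on builds with
and without the change to measure the listdir()-then-open() path. This is
only a sketch using the stdlib timeit module: the helper names
(list_and_open, bench), the file count and the repeat counts are
assumptions chosen for illustration, not part of any existing benchmark
suite.

```python
import os
import tempfile
import timeit


def list_and_open(path):
    """List a directory, then open and close every regular file in it."""
    opened = 0
    for name in os.listdir(path):
        full = os.path.join(path, name)
        if os.path.isfile(full):
            with open(full, "rb"):
                opened += 1
    return opened


def bench(n_files=100, repeat=5, number=20):
    """Return the best observed time in seconds for one listdir()+open() pass."""
    with tempfile.TemporaryDirectory() as tmp:
        # Create small files so that the timing is dominated by the
        # filename handling and open() calls, not by I/O on file contents.
        for i in range(n_files):
            with open(os.path.join(tmp, f"file{i}.txt"), "w") as f:
                f.write("x")
        timings = timeit.repeat(
            lambda: list_and_open(tmp), repeat=repeat, number=number
        )
    # The minimum is the measurement least disturbed by system noise.
    return min(timings) / number


if __name__ == "__main__":
    print(f"{bench() * 1e6:.1f} us per listdir()+open() pass")
```

The files are warm in the kernel cache after the first pass, so this
measures the string handling and system call overhead rather than disk
access.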

> That said, I spent some time yesterday converting the importlib cache to
> use scandir and separate caches for dir/file (to avoid the stat calls)
> and it made very little overall difference. I have to assume the string
> manipulation dominates. (Making DirEntry lazily calculate its .path had
> a bigger impact. Also, I didn't try to make Windows flush its own stat
> cache, and accessing warm files is much faster than cold ones.)

I helped Ben Hoyt design and implement his PEP 471 (os.scandir).
When the kernel filesystem cache is already filled, the speedup of
os.scandir() is hard to notice. But when you work on a network
filesystem like NFS, os.scandir() is about 5x faster. NFS doesn't cache
stat() by default.
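
To illustrate where the saving comes from, here is a minimal sketch of the
two approaches for listing subdirectories. The function names are
hypothetical. With os.listdir(), each os.path.isdir() call needs its own
stat(); with os.scandir(), DirEntry.is_dir() can usually answer from the
information returned by the directory listing itself, so in most cases no
extra system call is needed per entry.

```python
import os


def subdirs_listdir(path):
    """List subdirectory names using listdir(): one stat() per entry."""
    return [
        name
        for name in os.listdir(path)
        if os.path.isdir(os.path.join(path, name))  # calls stat() each time
    ]


def subdirs_scandir(path):
    """List subdirectory names using scandir().

    DirEntry.is_dir() normally uses the file type returned by the
    directory listing, so it avoids a stat() call in most cases
    (symbolic links and some filesystems still require one).
    """
    with os.scandir(path) as entries:
        return [entry.name for entry in entries if entry.is_dir()]
```

Both functions return the same names (in arbitrary order); only the number
of stat() calls differs, which is what matters when each call is a network
round trip.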

