[Python-Dev] os.walk() is going to be *fast* with scandir

R. David Murray rdmurray at bitdance.com
Sun Aug 10 15:55:40 CEST 2014


On Sun, 10 Aug 2014 13:57:36 +1000, Nick Coghlan <ncoghlan at gmail.com> wrote:
> On 10 August 2014 13:20, Antoine Pitrou <antoine at python.org> wrote:
> > On 09/08/2014 12:43, Ben Hoyt wrote:
> >
> >> Just thought I'd share some of my excitement about how fast the all-C
> >> version [1] of os.scandir() is turning out to be.
> >>
> >> Below are the results of my scandir / walk benchmark run with three
> >> different versions. I'm using an SSD, which seems to make the speedup
> >> over listdir / walk especially large. Note that benchmark results can
> >> vary a lot, depending on operating system, file system, hard drive
> >> type, and the OS's caching state.
> >>
> >> Anyway, os.walk() can be FIFTY times as fast using os.scandir().
> >
> >
> > Very nice results, thank you :-)
> 
> Indeed!
> 
> This may actually motivate me to start working on a redesign of
> walkdir at some point, with scandir and DirEntry objects as the basis.
> My original approach was just too slow to be useful in practice (at
> least when working with trees on the scale of a full Fedora or RHEL
> build hosted on an NFS share).

There is another potentially good place in the stdlib to apply scandir:
iglob.  See issue 22167.
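
To make the idea concrete, here is a rough sketch of what a scandir-based
match over a single directory could look like. This is not the patch on the
issue, just an illustration: scandir_glob and its dirs_only parameter are
names I made up, and it assumes a Python where os.scandir() is in the stdlib
and usable as a context manager (3.6 or later). It also skips glob's special
handling of dotfiles.

```python
import fnmatch
import os


def scandir_glob(dirname, pattern, dirs_only=False):
    """Yield paths of entries in dirname whose names match pattern.

    Hypothetical helper for illustration only; not part of the stdlib.
    """
    with os.scandir(dirname) as entries:
        for entry in entries:
            # Match on the bare name, as glob does for one path component.
            if not fnmatch.fnmatch(entry.name, pattern):
                continue
            # DirEntry.is_dir() can usually answer from data the directory
            # scan already returned, avoiding a separate stat() per entry.
            if dirs_only and not entry.is_dir():
                continue
            yield entry.path
```

The potential win is in the dirs_only case (which is roughly what a pattern
ending in a separator needs): a listdir-based version has to call
os.path.isdir() on every candidate, while the DirEntry usually already knows
the answer.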

--David

