[Python-ideas] Speed up os.walk() 5x to 9x by using file attributes from FindFirst/NextFile() and readdir()

Antoine Pitrou solipsis at pitrou.net
Wed Nov 14 11:52:04 CET 2012


On Wed, 14 Nov 2012 22:53:44 +1300,
Robert Collins
<robertc at robertcollins.net> wrote:
> 
> Data from bzr:
>  you can get a very significant speed up by doing two things:
>  - use readdir to get the inode numbers of the files in the directory
> and stat the files in-increasing-number-order. (this gives you
> monotonically increasing IO).

This assumes directory entries are sorted by inode number (in a btree,
I imagine). Is this assumption specific to some Linux / Ubuntu
filesystem?
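
For concreteness, the quoted suggestion could be sketched roughly as
follows. This is only an illustration, not bzr's code: it assumes
os.scandir(), which was added to the standard library later (Python 3.5)
and exposes the inode number reported by readdir() through
DirEntry.inode(). The helper name stat_in_inode_order is made up for this
example, and whether the ordering helps at all depends on the filesystem,
which is exactly the open question above.

```python
import os

def stat_in_inode_order(path):
    """Return [(name, stat_result), ...] for the entries of `path`,
    issuing the lstat() calls in increasing inode-number order.

    Hypothetical helper for illustration only.
    """
    # Collect the directory entries first; on POSIX systems inode()
    # comes from the readdir() result and needs no extra system call.
    with os.scandir(path) as it:
        entries = sorted(it, key=lambda entry: entry.inode())
    # Now stat each entry, lowest inode number first.
    return [(entry.name, os.lstat(entry.path)) for entry in entries]
```

Any gain would have to be measured on the filesystem in question; the
sketch only shows the ordering step.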

>  - chdir to the directory before you stat and use a relative path: it
> turns out when working with many files that the overhead of absolute
> paths is substantial.

How about using fstatat() instead? chdir() is a no-no since it's
a process-wide setting.
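
A minimal sketch of the fstatat() idea from Python, assuming the dir_fd
parameter that os.stat() gained in Python 3.3 on platforms that support
it. The helper name stat_relative is hypothetical. The directory is opened
once and each name is resolved relative to that descriptor, so the
process-wide working directory is never touched.

```python
import os

def stat_relative(dirpath, names):
    """lstat() each name relative to an open directory descriptor.

    Hypothetical helper for illustration only; requires a platform
    where os.stat accepts dir_fd (fstatat-style lookups).
    """
    if os.stat not in os.supports_dir_fd:
        raise NotImplementedError("os.stat() does not support dir_fd here")
    dir_fd = os.open(dirpath, os.O_RDONLY)
    try:
        # Relative lookups: the kernel does not re-walk the full path
        # for every file, and no chdir() is needed.
        return {
            name: os.stat(name, dir_fd=dir_fd, follow_symlinks=False)
            for name in names
        }
    finally:
        os.close(dir_fd)
```

Since nothing here changes the current directory, other threads are
unaffected, which is the objection to chdir() raised above.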

Regards

Antoine.




