[Python-ideas] Speed up os.walk() 5x to 9x by using file attributes from FindFirst/NextFile() and readdir()

Mike Meyer mwm at mired.org
Wed Nov 14 13:16:13 CET 2012


On Nov 14, 2012 4:55 AM, "Antoine Pitrou" <solipsis at pitrou.net> wrote:
>
> On Wed, 14 Nov 2012 22:53:44 +1300,
> Robert Collins
> <robertc at robertcollins.net> wrote:
> >
> > Data from bzr:
> >  you can get a very significant speed up by doing two things:
> >  - use readdir to get the inode numbers of the files in the directory
> > and stat the files in-increasing-number-order. (this gives you
> > monotonically increasing IO).
>
> This assumes directory entries are sorted by inode number (in a btree,
> I imagine). Is this assumption specific to some Linux / Ubuntu
> filesystem?

It doesn't assume that, because inodes aren't stored in directories on
POSIX file systems. Instead, directories hold names & inode numbers. The
inode (which is where the data stat returns lives) is stored elsewhere in
the file system, typically in on-disk arrays indexed by inode number (and
that's grossly oversimplified). So stat'ing in increasing inode number
order walks those arrays roughly sequentially, whatever order the
directory keeps its entries in.
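
A rough sketch of the readdir-then-sorted-stat idea, assuming Python 3.5+
where os.scandir() exposes the inode number that readdir() returns. The
function name stat_in_inode_order is made up for illustration, and whether
this helps at all depends on the file system:

```python
import os

def stat_in_inode_order(path):
    """Return (name, stat_result) pairs, stat'ing in increasing inode order.

    Hypothetical helper for illustration; not part of the os module.
    """
    with os.scandir(path) as it:
        # The directory entry carries the name and inode number, so this
        # step does not need to read the inodes themselves on POSIX.
        entries = [(entry.inode(), entry.name) for entry in it]

    # Sort by inode number so the stat calls below hit the on-disk
    # inode storage in increasing order.
    entries.sort()

    results = []
    for _inode, name in entries:
        try:
            # lstat so symlinks are not followed into other inodes.
            results.append((name, os.lstat(os.path.join(path, name))))
        except FileNotFoundError:
            # The entry vanished between readdir and stat; skip it.
            continue
    return results
```

Calling stat_in_inode_order(".") gives the same information as a plain
listdir-plus-lstat loop, just gathered in a different order.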

I'm not sure how this would work on modern file systems (zfs, btrfs).

     <mike