[Python-ideas] Speed up os.walk() 5x to 9x by using file attributes from FindFirst/NextFile() and readdir()
Andrew Barnert
abarnert at yahoo.com
Thu Nov 15 16:36:34 CET 2012
From: Mike Meyer <mwm at mired.org>
Sent: Thu, November 15, 2012 2:29:44 AM
>If the goal is to make os.walk fast, then it might be better (on Posix systems,
>anyway) to see if it can be built on top of ftw instead of low-level directory
>scanning routines.
You can't actually use ftw, because it gives you no way to handle the options
to os.walk (topdown, onerror, followlinks). Plus, it's "obsolescent"
(POSIX.1-2008), "deprecated" (Linux), or "legacy" (OS X), and at least some
platforms will throw warnings when you call it. It's also somewhat
underspecified, and different platforms, even different versions of the same
platform, will give you different behavior in various cases (especially with
symlinks).
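To make the "options" point concrete, here is a minimal sketch of the os.walk
features that a plain ftw callback has no counterpart for: pruning by editing
dirnames in place during a top-down walk, an onerror hook, and the followlinks
switch. The function name walk_with_options and the dot-directory filter are
hypothetical, chosen only for illustration.

```python
import os


def walk_with_options(top):
    """Walk `top` top-down, skipping dot-directories and collecting errors."""
    errors = []   # OSError instances reported through the onerror hook
    seen = []     # every directory os.walk actually visited
    for dirpath, dirnames, filenames in os.walk(
            top, topdown=True, onerror=errors.append, followlinks=False):
        # In top-down mode the caller may edit dirnames in place to prune
        # the walk; ftw's callback has no way to express this.
        dirnames[:] = [d for d in dirnames if not d.startswith(".")]
        seen.append(dirpath)
    return seen, errors
```

Any replacement backend would have to preserve all three behaviors to stay
compatible with existing callers.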
But you could, e.g., use fts on platforms that have it, nftw on platforms that
have a version close enough to recent Linux/glibc for our purposes, and fall
back to readdir+stat for the rest. That could give a big speed improvement on
the most popular platforms, and on the others, at least things would be no worse
than today (and anyone who cared could much more easily write the appropriate
nftw/fts/whatever port for their platform once the framework was in place).
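A rough sketch of that framework, under stated assumptions: pick_backend and
walk_fallback are hypothetical names, the probe simply asks ctypes whether the
running process exports fts_open or nftw, and only the readdir+stat fallback is
actually implemented here (via os.listdir and os.stat). It does not check
whether a platform's nftw is "close enough" to glibc's; that would need
per-platform knowledge.

```python
import ctypes
import os
import stat


def pick_backend():
    """Return 'fts', 'nftw', or 'readdir+stat' based on what libc exports."""
    try:
        libc = ctypes.CDLL(None)   # symbols of the running process (POSIX)
    except (OSError, TypeError):
        return "readdir+stat"      # e.g. Windows: no libc to probe this way
    for backend, symbol in (("fts", "fts_open"), ("nftw", "nftw")):
        if hasattr(libc, symbol):  # missing symbols raise AttributeError
            return backend
    return "readdir+stat"


def walk_fallback(top, topdown=True, onerror=None, followlinks=False):
    """Portable fallback with os.walk's options: list, then stat each entry."""
    try:
        names = os.listdir(top)    # backed by readdir() on POSIX
    except OSError as err:
        if onerror is not None:
            onerror(err)
        return
    dirs, nondirs = [], []
    for name in names:
        try:
            mode = os.stat(os.path.join(top, name)).st_mode
        except OSError:
            mode = 0               # unreadable or dangling: treat as non-dir
        (dirs if stat.S_ISDIR(mode) else nondirs).append(name)
    if topdown:
        yield top, dirs, nondirs   # caller may prune dirs in place
    for name in dirs:
        path = os.path.join(top, name)
        if followlinks or not os.path.islink(path):
            yield from walk_fallback(path, topdown, onerror, followlinks)
    if not topdown:
        yield top, dirs, nondirs
```

The fts and nftw backends would plug in behind the same generator interface, so
callers never see which one was chosen.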