[Python-ideas] BetterWalk, a better and faster os.walk() for Python
Andrew Barnert
abarnert at yahoo.com
Fri Nov 23 05:42:28 CET 2012
From: Ben Hoyt <benhoyt at gmail.com>
Sent: Thu, November 22, 2012 2:44:49 PM
> In the meantime, anyone who wants to comment on the iterdir_stat() API
> or other issues, go ahead!
I already mentioned the problem with following symlinks into nonexistent paths.
The followlinks implementation in os.walk seems wrong. If need_stat is false,
iterdir_stat will return S_IFLNK for a symlink, but then os.walk only checks for
S_IFDIR, so it won't recurse into symlinked directories. Plus, it looks like,
even if you got that right, you'd be trying to opendir every symlink, even the
ones you know aren't links to directories.
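
A rough sketch of the check I have in mind, not a patch against your code. The
helper name should_recurse and its lstat_mode argument are made up for
illustration; lstat_mode stands for whatever st_mode the no-stat path reports.
Only entries already known to be symlinks pay for an extra stat call, and a
dangling link is skipped instead of raising:

```python
import os
import stat


def should_recurse(path, lstat_mode, followlinks=False):
    """Decide whether a walk should descend into path.

    lstat_mode is the st_mode reported without following symlinks
    (hypothetical: what a no-stat directory listing would hand back).
    """
    if stat.S_ISDIR(lstat_mode):
        return True
    if followlinks and stat.S_ISLNK(lstat_mode):
        try:
            # Only symlinks get the extra stat() call, which follows the link.
            return stat.S_ISDIR(os.stat(path).st_mode)
        except OSError:
            # Dangling link or permission problem: do not recurse.
            return False
    return False
```
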
On a related note, if you call iterdir_stat with just None or st_mode_type,
symlinks will show up as links, but if you call with anything else, they'll show
up as the referenced file. I think you really want a "physical" flag to control
whether you call stat or lstat, although I'm not sure what should happen for the
no-stat version in that case. (Maybe look at what fts and nftw do?)
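
Something like the following is what I mean by a "physical" flag. The name
follows FTS_PHYSICAL in fts and FTW_PHYS in nftw; the function stat_entry is
hypothetical, and this says nothing about what the no-stat case should do:

```python
import os


def stat_entry(path, physical=False):
    """Return stat information for path.

    physical=True describes the symlink itself (lstat);
    physical=False describes the file it refers to (stat).
    """
    return os.lstat(path) if physical else os.stat(path)
```
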
It might also be handy to be able to not call stat on directories, so if you
wanted an "iterwalk_stat", it could just call fstat(dirfd(d)) after opendir (as
nftw does).
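
In pure Python the closest equivalent I can think of is to open the directory
and fstat the descriptor. This is only a sketch, assuming a POSIX system where
a directory can be opened read-only; stat_open_directory is a made-up name:

```python
import os


def stat_open_directory(path):
    """Stat a directory through an open descriptor rather than by path."""
    # Assumption: POSIX semantics, where O_RDONLY works on a directory.
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.fstat(fd)
    finally:
        os.close(fd)
```
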
Your code assumes that all paths are UTF-8. That's not guaranteed for Linux or
FreeBSD (although it is for OS X); you want sys.getfilesystemencoding().
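
For example, something along these lines, assuming Python 3 and that names
arrive as bytes. The helper decode_name is hypothetical, and the
surrogateescape error handler is my choice, not something your code has to
use:

```python
import sys


def decode_name(raw_name):
    """Decode a bytes file name with the filesystem encoding, not UTF-8."""
    encoding = sys.getfilesystemencoding()
    # surrogateescape keeps undecodable bytes round-trippable
    # (an assumption about the desired error handling).
    return raw_name.decode(encoding, "surrogateescape")
```
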
Windows wildcards and fnmatch are not the same, and your check for '[' in
pattern or pattern.endswith('?') is not sufficient to distinguish between the
two.
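
One concrete difference, as a small demonstration rather than a fix: fnmatch
treats "[" as the start of a character class, while on Windows square brackets
are ordinary file name characters. The file names here are invented:

```python
import fnmatch

pattern = "report[1].txt"

# fnmatch reads "[1]" as a character class containing "1", so the pattern
# does not match a file whose name literally contains the brackets...
literal_match = fnmatch.fnmatchcase("report[1].txt", pattern)

# ...but it does match the name with a plain "1" in that position.
class_match = fnmatch.fnmatchcase("report1.txt", pattern)
```
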
The docstring should mention that, on other platforms, fields=None returns no
stat information for free.
The docstring (and likewise the comments) refers to "BSD" in general, but what
you actually check for is "freebsd". I think OpenBSD, NetBSD, etc. will work
with the BSD code; if not, the docs shouldn't imply that they do.
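
If they do turn out to work, the check could look roughly like this. It is
untested on anything but the idea: the prefix list and the name
uses_bsd_dirent are my assumptions, and sys.platform can carry a version
suffix (such as "freebsd9"), hence the prefix match:

```python
import sys

# Assumption: these platforms can share the BSD code path. Unverified.
BSD_PREFIXES = ("freebsd", "openbsd", "netbsd")


def uses_bsd_dirent(platform=None):
    """Return True if the platform name starts with a known BSD prefix."""
    name = sys.platform if platform is None else platform
    return name.startswith(BSD_PREFIXES)
```
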
I believe cygwin can also use the BSD code. (I know they use FreeBSD's fts.c
unmodified.)