Is there an alternative to os.walk?
Bruce
epost2 at gmail.com
Sat Oct 7 13:34:10 EDT 2006
waylan wrote:
> Bruce wrote:
> > Hi all,
> > I have a question about traversing file systems, and could use some
> > help. Because of directories with many files in them, os.walk appears
> > to be rather slow. I'm thinking there is a potential for speed-up since
> > I don't need os.walk to report filenames of all the files in every
> > directory it visits. Is there some clever way to use os.walk or another
> > tool that would provide functionality like os.walk except for the
> > listing of the filenames?
>
> You might want to check out the path module [1] (not os.path). The
> following is from the docs:
>
> > The method path.walk() returns an iterator which steps recursively
> > through a whole directory tree. path.walkdirs() and path.walkfiles()
> > are the same, but they yield only the directories and only the files,
> > respectively.
>
> Oh, and you can thank Paul Bissex for pointing me to path [2].
>
> [1]: http://www.jorendorff.com/articles/python/path/
> [2]: http://e-scribe.com/news/289
A little late, but thanks for the replies; they were very useful. Here's
what I do in this case:

import os

def search(a_dir):
    # Collect every directory under a_dir that passes dirtest
    valid_dirs = []
    for dirpath, dirnames, filenames in os.walk(a_dir):
        if dirtest(dirpath):
            valid_dirs.append(dirpath)
    return valid_dirs

def dirtest(a_dir):
    # True only if every test file exists in a_dir
    testfiles = ['a', 'b', 'c']
    for f in testfiles:
        if not os.path.exists(os.path.join(a_dir, f)):
            return False
    return True

I think you're right: it's not os.walk that makes this slow, it's the
dirtest function, which takes much longer when there are many files
in a directory. Also, thanks for pointing me to the path module; it was
interesting.
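
One possible way to cut that cost, offered as a sketch and not as something
measured in this thread: os.walk has already listed each directory, so the
names can be checked in memory against dirnames and filenames instead of
calling os.path.exists once per name. REQUIRED and the required parameter
are names made up for this sketch; 'a', 'b' and 'c' are the placeholder
names from the code above.

```python
import os

# Hypothetical constant for this sketch; 'a', 'b', 'c' are placeholders.
REQUIRED = frozenset(['a', 'b', 'c'])

def search(a_dir, required=REQUIRED):
    """Return directories under a_dir that contain every name in required."""
    valid_dirs = []
    for dirpath, dirnames, filenames in os.walk(a_dir):
        # os.walk already read this directory, so test membership in the
        # lists it returned rather than hitting the file system again.
        if required.issubset(dirnames + filenames):
            valid_dirs.append(dirpath)
    return valid_dirs
```

One difference from os.path.exists: a broken symlink counts as present
here, because os.walk reports it among filenames.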