[Python-Dev] os.path.walk() lacks 'depth first' option

Tim Peters tim.one@comcast.net
Sun, 20 Apr 2003 22:12:42 -0400


[Guido]
> But if I had to do it over again, I wouldn't have added walk() in the
> current form.  I often find it harder to fit a particular program's
> needs in the API offered by walk() than it is to reimplement the walk
> myself.  That's why I'm concerned about adding to it.

We also have another possibility now:  a pathname generator.  Then the funky
callback and mystery-arg ("what's the purpose of the 'arg' arg?" is a
semi-FAQ on c.l.py) bits can go away, and client code could look like:

    for path in walk(root):
        # filter, if you like, via 'if whatever: continue'
        # accumulate state, if you like, in local vars

Or it could look like

    for top, names in walk(root):

or

    for top, dirnames, nondirnames in walk(root):


Here's an implementation of the last flavor.  Besides the more-or-less
obvious topdown argument (yield a directory before its subdirectories when
True, after them when False), note a subtlety:  when topdown is True, the
caller can prune the search by mutating the dirs list yielded to it.  For
example,

for top, dirs, nondirs in walk('C:/code/python'):
    print top, dirs, len(nondirs)
    if 'CVS' in dirs:
        dirs.remove('CVS')

doesn't descend into CVS subdirectories.
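
The pruning works only because the caller edits the very list object the
generator goes on to iterate over; rebinding the name to a new list has no
effect.  A minimal sketch of that distinction follows.  It assumes a toy
in-memory tree of nested dicts instead of a real filesystem, and TREE,
walk_tree, visited and the two prune functions are hypothetical names made up
for illustration:

```python
# Toy stand-in for a directory tree: nested dicts are directories,
# None marks a plain file.  (Hypothetical data, for illustration only.)
TREE = {
    "src": {"CVS": {"Entries": None}, "main.py": None},
    "CVS": {"Root": None},
    "README": None,
}

def walk_tree(tree, top="."):
    """Hypothetical in-memory analogue of a top-down walk."""
    dirs = sorted(name for name, child in tree.items() if child is not None)
    nondirs = sorted(name for name, child in tree.items() if child is None)
    yield top, dirs, nondirs
    # Execution resumes here after the caller is done with this entry.
    # The loop iterates over the *same* list object the caller received,
    # so in-place edits made by the caller are visible.
    for name in dirs:
        for x in walk_tree(tree[name], top + "/" + name):
            yield x

def visited(prune):
    """Return the list of directories visited when prune() sees each dirs."""
    seen = []
    for top, dirs, nondirs in walk_tree(TREE):
        seen.append(top)
        prune(dirs)
    return seen

def prune_in_place(dirs):
    # Mutates the shared list: the generator skips CVS.
    if "CVS" in dirs:
        dirs.remove("CVS")

def prune_by_rebinding(dirs):
    # Only rebinds a local name: the generator's list is untouched.
    dirs = [d for d in dirs if d != "CVS"]

in_place = visited(prune_in_place)      # ['.', './src']
rebound = visited(prune_by_rebinding)   # ['.', './CVS', './src', './src/CVS']
```

Slice assignment (dirs[:] = ...) also mutates in place, so it prunes just as
dirs.remove() does.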

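def walk(top, topdown=True):
    import os

    try:
        names = os.listdir(top)
    except os.error:
        # Unreadable or vanished directory:  yield nothing for it.
        return

    exceptions = ('.', '..')
    dirs, nondirs = [], []
    for name in names:
        if name in exceptions:
            continue
        fullname = os.path.join(top, name)
        if os.path.isdir(fullname):
            dirs.append(name)
        else:
            nondirs.append(name)
    if topdown:
        yield top, dirs, nondirs
    # If the caller pruned dirs after the yield above, only the survivors
    # are visited here.
    for name in dirs:
        # Pass topdown along, else a bottom-up walk goes top-down below
        # the first level.
        for x in walk(os.path.join(top, name), topdown):
            yield x
    if not topdown:
        yield top, dirs, nondirs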
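
One place the bottom-up order matters is deleting a tree, since os.rmdir()
refuses to remove a directory that still has anything in it.  The sketch below
is only an illustration under stated assumptions:  it restates the generator
for current Python, forwards topdown to the recursive calls so the walk stays
bottom-up at every level, and runs against a throwaway temporary directory
whose layout (a/one.txt, a/b/two.txt) is made up for the example.

```python
import os
import tempfile

def walk(top, topdown=True):
    """Yield (top, dirs, nondirs) for top and every directory below it."""
    try:
        names = os.listdir(top)
    except OSError:
        return
    dirs, nondirs = [], []
    for name in names:
        if os.path.isdir(os.path.join(top, name)):
            dirs.append(name)
        else:
            nondirs.append(name)
    if topdown:
        yield top, dirs, nondirs
    for name in dirs:
        # topdown is forwarded so every level uses the same order
        for x in walk(os.path.join(top, name), topdown):
            yield x
    if not topdown:
        yield top, dirs, nondirs

# Build a small throwaway tree: root/a/one.txt and root/a/b/two.txt
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "a", "b"))
for parts in (("a", "one.txt"), ("a", "b", "two.txt")):
    open(os.path.join(root, *parts), "w").close()

# Bottom-up: each directory is yielded only after its subdirectories,
# so by the time we reach it, removing its files leaves it empty.
order = []
for top, dirs, nondirs in walk(root, topdown=False):
    order.append(os.path.relpath(top, root))
    for name in nondirs:
        os.remove(os.path.join(top, name))
    os.rmdir(top)

# order is now [os.path.join("a", "b"), "a", "."] and root no longer exists
```

With topdown=True the same loop would fail at the first os.rmdir(), because
the root is yielded while it still contains a.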