[Tutor] bash find equivalent in Python

Magnus Lyckå magnus@thinkware.se
Fri Jul 11 06:41:02 2003


At 03:42 2003-07-11 +0200, Karl Pflästerer wrote:
> > Nah, os.path.walk is just the thing here!
> >
> >  >>> l = []
> >  >>> def htmlFinder(arg, dirname, fnames):
> > ...     for fn in fnames:
> > ...             if fn.endswith('.html'):
> > ...                     l.append(os.path.join(dirname, fn))
> > ...
> >  >>> os.path.walk('c:\\', htmlFinder, None)
> >  >>> for fn in l: print fn
>
>That is pretty short but I always found os.path.walk a little bit slow.

True! But it was fast to write! ;)

Have you tried profiling and improving walk? The code is
very small and simple, but I'm sure it could be improved.
Making os.path.walk better will benefit all of us, and
let you continue to write very small and simple functions.

There is a cost you can't avoid in the function call for
each directory, but compared to the speed of disk accesses,
that seems rather tiny. It's also doing 'isdir()' on each
file. Is that a resource hog? But you do that as well!

def walk(top, func, arg):
     # Extracted from ntpath.py. Doc string removed.
     # join and isdir are module-level functions there
     # (os.path.join and os.path.isdir).
     try:
         names = os.listdir(top)
     except os.error:
         return
     func(arg, top, names)
     exceptions = ('.', '..')
     for name in names:
         if name not in exceptions:
             name = join(top, name)
             if isdir(name):
                 walk(name, func, arg)
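
To see where the time actually goes, one could profile a copy of that
function. What follows is only a sketch, written so that it also runs on
Python 3, where os.path.walk no longer exists. The local walk() mirrors
the ntpath code above; html_finder, build_sample_tree, profile_walk and
the little directory layout are made up for illustration. cProfile and
pstats are the standard library profiler modules.

```python
import cProfile
import io
import os
import pstats
import tempfile
from os.path import isdir, join


def walk(top, func, arg):
    # Same logic as the ntpath.py version quoted above.
    try:
        names = os.listdir(top)
    except OSError:
        return
    func(arg, top, names)
    exceptions = ('.', '..')
    for name in names:
        if name not in exceptions:
            name = join(top, name)
            if isdir(name):
                walk(name, func, arg)


def html_finder(found, dirname, fnames):
    # Hypothetical visitor: collect every name ending in .html.
    for fn in fnames:
        if fn.endswith('.html'):
            found.append(join(dirname, fn))


def build_sample_tree(root):
    # Made-up layout, only so the sketch has something to walk.
    os.makedirs(join(root, 'a', 'b'))
    for rel in ('index.html', join('a', 'notes.txt'),
                join('a', 'b', 'deep.html')):
        open(join(root, rel), 'w').close()


def profile_walk(top):
    # Run walk() under the profiler and return (matches, report text).
    found = []
    profiler = cProfile.Profile()
    profiler.enable()
    walk(top, html_finder, found)
    profiler.disable()
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats('cumulative').print_stats(10)
    return found, out.getvalue()


if __name__ == '__main__':
    with tempfile.TemporaryDirectory() as root:
        build_sample_tree(root)
        found, report = profile_walk(root)
        print(sorted(found))
        print(report)
```

On a real tree, the report should show how the time splits between
listdir, isdir and the visitor calls, which is the comparison discussed
above. A tree this small says little about disk access costs.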

By the way, a more generic version of my program would use regular
expressions. The version below prints each match as soon as it is
found instead of putting it in a list.

 >>> def fnFinder(reg_exp, dirname, fnames):
...     for fn in fnames:
...             if reg_exp.search(fn):
...                     print os.path.join(dirname, fn)
...
 >>> import os, re
 >>> os.path.walk('C:\\', fnFinder, re.compile(r'\.html$'))

Another variant is to pass in the list that should collect the file
names.

 >>> def fnFinder((aList, reg_exp), dirname, fnames):
...     for fn in fnames:
...             if reg_exp.search(fn):
...                     aList.append(os.path.join(dirname, fn))
...
 >>> l = []
 >>> os.path.walk('p:/tmp/', fnFinder, (l, re.compile(r'\.html$')))
 >>> print l
['p:/tmp/index.html', 'p:/tmp/install_guide_2.0.html']
 >>>
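
For anyone trying this on Python 3: os.path.walk was removed there, and
so was tuple unpacking in a parameter list, as used in fnFinder above.
A rough equivalent can be sketched with os.walk, which appeared in
Python 2.3. The name find_files is made up for illustration, and this
is one possible shape, not the only one.

```python
import os
import re


def find_files(top, reg_exp):
    # Hypothetical helper: yield every file path under top whose name
    # matches the compiled regular expression.
    # Unlike the os.path.walk visitor, which sees directory names in
    # fnames too, this only looks at non-directory entries.
    for dirname, dirnames, fnames in os.walk(top):
        for fn in fnames:
            if reg_exp.search(fn):
                yield os.path.join(dirname, fn)


if __name__ == '__main__':
    # Print matches as they are found ...
    for path in find_files('.', re.compile(r'\.html$')):
        print(path)
    # ... or collect them in a list instead.
    matches = list(find_files('.', re.compile(r'\.html$')))
```

Because find_files is a generator, the caller decides whether to print
at once or build a list, so the two fnFinder variants collapse into one.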


--
Magnus Lycka (It's really Lyckå), magnus@thinkware.se
Thinkware AB, Sweden, www.thinkware.se
I code Python ~ The Agile Programming Language