[perl-python] 20050127 traverse a dir

Xah Lee xah at xahlee.org
Thu Jan 27 20:37:05 CET 2005

# -*- coding: utf-8 -*-
# Python

suppose you want to walk a directory tree, say, to apply a string
replacement to all html files. os.path.walk() rises to the occasion.

© import os
© mydir= '/Users/t/Documents/unix_cilre/python'
© def myfun(s1, s2, s3):
©      print s2 # current dir
©      print s3 # list of files there
©      print '------==(^_^)==------'
© os.path.walk(mydir, myfun, 'somenull')

os.path.walk(base_dir,f,arg) will walk a dir tree starting at
base_dir, and whenever it sees a directory (including base_dir), it
will call f(arg,current_dir,children), where the current_dir is the
string of the current directory, and children is a *list* of all
children of the current directory. That is, a list of strings that are
file names and directory names. Try the above and you'll see.
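
For readers on Python 3, where os.path.walk no longer exists, the same
calling convention can be sketched on top of os.walk. This is only an
illustration: walk_with_callback is a made-up helper name, and the
throwaway directory tree is there just so the snippet runs anywhere.

```python
import os
import tempfile

def walk_with_callback(base_dir, f, arg):
    """Call f(arg, current_dir, children) once per directory under base_dir.

    Hypothetical helper that mimics the calling convention described above.
    """
    for current_dir, subdirs, files in os.walk(base_dir):
        # os.walk reports sub-directories and other entries separately;
        # merge them into one list of children.
        f(arg, current_dir, sorted(subdirs + files))

# Build a tiny throwaway tree (assumed layout, for demonstration only).
demo_root = tempfile.mkdtemp()
os.mkdir(os.path.join(demo_root, 'sub'))
open(os.path.join(demo_root, 'a.html'), 'w').close()
open(os.path.join(demo_root, 'sub', 'b.txt'), 'w').close()

visited = []

def myfun(arg, current_dir, children):
    # Record the directory (relative to the root) and its children.
    visited.append((os.path.relpath(current_dir, demo_root), children))

walk_with_callback(demo_root, myfun, 'somenull')
for entry in visited:
    print(entry)
```

The callback is invoked for the base directory first and then for each
directory below it, each time with the full list of that directory's
children.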

now, suppose for each file ending in .html we want to apply a function
g to it. So, whenever myfun is called, we need to loop thru the
children list, find the entries ending in .html (that are files and not
directories), then call g on each. Here's the code.

© import os
© mydir= '/Users/t/web/SpecialPlaneCurves_dir'
© def g(s): print "g touched:", s
© def myfun(dummy, dirr, filess):
©      for child in filess:
©          if '.html' == os.path.splitext(child)[1] \
©                  and os.path.isfile(dirr+'/'+child):
©              g(dirr+'/'+child)
© os.path.walk(mydir, myfun, 3)
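
The same filtering can be sketched with os.walk, which also exists in
Python 3. The function name html_files and the throwaway tree are
assumptions made for the demo. Because os.walk hands over directories
and non-directory entries in separate lists, a directory that happens
to be named like an .html file is skipped without an explicit isfile
test.

```python
import os
import tempfile

def html_files(base_dir):
    """Return sorted full paths of non-directory entries ending in .html."""
    found = []
    for current_dir, subdirs, files in os.walk(base_dir):
        for child in files:
            if os.path.splitext(child)[1] == '.html':
                # os.path.join adds the separator only where it is needed.
                found.append(os.path.join(current_dir, child))
    return sorted(found)

# Throwaway tree (assumed layout): two html files, one text file,
# and a directory whose name ends in .html.
demo_root = tempfile.mkdtemp()
os.mkdir(os.path.join(demo_root, 'sub'))
os.mkdir(os.path.join(demo_root, 'dir.html'))
for rel in ('index.html', 'notes.txt', os.path.join('sub', 'page.html')):
    open(os.path.join(demo_root, rel), 'w').close()

result = html_files(demo_root)
for path in result:
    print("g touched:", path)
```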

note that os.path.splitext splits a string into two parts: the portion
before the last period, and the rest (the period and the suffix).
Effectively it is used for getting the file suffix. And os.path.isfile()
makes sure that this is a file and not a dir with a .html suffix. Test it
yourself.
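
A quick way to test the splitting yourself; the file names are made up
for illustration. Note that the second part keeps its leading period,
and that only the last period counts.

```python
import os.path

# Hypothetical file names, chosen to show the edge cases.
names = ['index.html', 'archive.tar.gz', 'README']
parts = {name: os.path.splitext(name) for name in names}

for name in names:
    print(name, '->', parts[name])
```

A name without any period yields an empty suffix, so comparing the
second part against '.html' is safe for such names too.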

one important thing to note: mydir must not end in a
slash. One'd think Python'd take care of such trivia, but no. This took
me a while to debug.
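
One way to sidestep the slash question, sketched here as a suggestion
and not as what the code above does: build paths with os.path.join and
clean the base directory with os.path.normpath. The snippet uses
posixpath, which is what os.path is on Unix-like systems, so that the
result is the same wherever it runs. The child name index.html is
made up.

```python
import posixpath  # os.path is this module on Unix-like systems

with_slash = '/Users/t/web/SpecialPlaneCurves_dir/'
without_slash = '/Users/t/web/SpecialPlaneCurves_dir'
child = 'index.html'  # hypothetical file name

# join inserts a separator only if the left part does not end in one.
joined_a = posixpath.join(with_slash, child)
joined_b = posixpath.join(without_slash, child)

# normpath drops a trailing slash (among other clean-ups).
cleaned = posixpath.normpath(with_slash)

print(joined_a)
print(joined_b)
print(cleaned)
```

Both joins give the same path, and the cleaned directory equals the
form without the trailing slash.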

also, the semantics of os.path.walk are nice. The myfun can
be a recursive function, calling itself, crystalizing a program's

# in Perl, similar program can be had.
# the prototypical way to traverse a dir
# is thru File::Find;

use File::Find qw(find);
$mydir= '/Users/t/web/SpecialPlaneCurves_dir';
find(\&wanted, $mydir);
sub g($){print shift, "\n";}
sub wanted {
    if ($_ =~ /\.html$/ && -T $File::Find::name) { g $File::Find::name; }
}

# the above showcases a quick hack.
# File::Find is one of the worst modules
# there is in Perl. One cannot use it
# with a recursive (so-called) "filter"
# function. And because the way it is
# written, one cannot make the filter
# function purely functional. (it relies
# on the $_) And the filter function
# must come in certain order. (for
# example, the above program won't work
# if g is moved to the bottom.)  ...

# the quality of modules in Perl is
# all like that.
 xah at xahlee.org
