postprocessing in os.walk
Dave Angel
davea at ieee.org
Tue Oct 13 12:31:25 EDT 2009
Peter Otten wrote:
> kj wrote:
>
>
>> In <mailman.1196.1255347115.2807.python-list at python.org> Dave Angel
>> <davea at ieee.org> writes:
>>
>>
>>> kj wrote:
>>>
>>>> Perl's directory tree traversal facility is provided by the function
>>>> find of the File::Find module. This function accepts an optional
>>>> callback, called postprocess, that gets invoked "just before leaving
>>>> the currently processed directory." The documentation goes on to
>>>> say "This hook is handy for summarizing a directory, such as
>>>> calculating its disk usage", which is exactly what I use it for in
>>>> a maintenance script.
>>>>
>>>> This maintenance script is getting long in the tooth, and I've been
>>>> meaning to add a few enhancements to it for a while, so I thought
>>>> that in the process I'd port it to Python, using the os.walk
>>>> function, but I see that os.walk does not have anything like this
>>>> File::Find::find's postprocess hook. Is there a good way to simulate
>>>> it (without having to roll my own File::Find::find in Python)?
>>>>
>>>> TIA!
>>>>
>>>> kynn
>>>>
>>>>
>>>>
>>> Why would you need a special hook when the os.walk() generator yields
>>> exactly once per directory? So whatever work you do on the list of
>>> files you get, you can then put the summary logic immediately after.
>>>
>>> Or if you really feel you need a special hook, then write a wrapper for
>>> os.walk(), which takes a hook function as a parameter, and after
>>> yielding each file in a directory, calls the hook. Looks like about 5
>>> lines.
>>>
>> I think you're missing the point. The hook in question has to be
>> called *immediately after* all the subtrees that are rooted in
>> subdirectories contained in the current directory have been visited
>> by os.walk.
>>
>> I'd love to see your "5 lines" for *that*.
>>
>
> import os
>
> def find(root, process):
> for pdf in os.walk(root, topdown=False):
> process(*pdf)
>
> def process(path, dirs, files):
> print path
>
> find(".", process)
>
> Peter
>
>
>
>
Thanks Peter,
To expand it to five lines, and make it the generator I mentioned,
import os
def find(root, process):
for pdf in os.walk(root, topdown=False):
for file in pdf[2]:
yield os.path.join(pdf[0],file)
process(*pdf)
def process(path, dirs, files):
print "hooked --", path
for fullpath in find("..", process):
print fullpath
This is a generator which yields each file in a directory tree, and
after all the files below a particular directory are processed,
"immediately" calls the hook
DaveA
More information about the Python-list
mailing list