os.walk help

hokiegal99 hokiegal99 at hotmail.com
Wed Nov 26 14:05:55 EST 2003


Could we discuss the topdown argument to os.walk a bit more? My script
is working fine now, and I have no trouble at all with it. I just want
to better understand os.walk in Python 2.3. This is how I understand it
as of today; someone please correct me if I'm wrong:

topdown=False would build a list of filesystem (fs) objects from the 
bottom up. The objects at the beginning of the list would be the 
end-most objects (the leaf nodes) of the fs. When you make changes 
based on that list, the changes would run from leaf node to os.walk's 
root instead of from root to leaf node, correct? For example, if I had 
this dir structure:

dir_a
    file_a
    dir_b
       file_b

My list would look like this:

file_b
dir_b
file_a
dir_a

And if I made changes to the list and committed those changes to the 
fs, there would be no problems because of the order in which the 
changes are made. Is this a proper way to describe topdown=False in 
os.walk? In other words, with topdown=False our list would be static 
(one change would not impact another), whereas with topdown=True our 
list would be dynamic (one change could impact another).
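
One way to check this understanding is to build the example tree in a
scratch directory and record what os.walk actually produces. The sketch
below is only an illustration: the helper name walk_order and the use of
tempfile are assumptions, not part of my script, and it is written for a
current Python rather than 2.3. What it suggests is that os.walk yields
one (root, dirs, files) triple per directory rather than one flat list,
and that topdown=False simply delivers a directory's triple after the
triples of its subdirectories.

```python
import os
import shutil
import tempfile


def walk_order(topdown):
    """Build dir_a/{file_a, dir_b/file_b} in a scratch directory and
    return the (dirname, dirs, files) triples in the order os.walk
    yields them."""
    base = tempfile.mkdtemp()
    try:
        os.makedirs(os.path.join(base, "dir_a", "dir_b"))
        for parts in (("dir_a", "file_a"), ("dir_a", "dir_b", "file_b")):
            open(os.path.join(base, *parts), "w").close()

        seen = []
        start = os.path.join(base, "dir_a")
        for root, dirs, files in os.walk(start, topdown=topdown):
            # Keep only the last path component so the result does not
            # depend on where the scratch directory was created.
            seen.append((os.path.basename(root), sorted(dirs), sorted(files)))
        return seen
    finally:
        shutil.rmtree(base)


bottom_up = walk_order(topdown=False)
top_down = walk_order(topdown=True)

for entry in bottom_up:
    print(entry)
# ('dir_b', [], ['file_b'])
# ('dir_a', ['dir_b'], ['file_a'])
```

With topdown=True the same two triples come out in the opposite order,
dir_a first. In that mode os.walk has not yet descended into dir_b when
it hands back the dirs list for dir_a, which is why a change made at
that point can affect what it visits next.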

Thanks for the help!!!

> Robin Munn wrote:
> 
>> Wait, I just realized that you're changing the list *while* you're
>> iterating over it. That's a bad idea. See the warning at the bottom of
>> this page in the language reference:
>>
>>     http://www.python.org/doc/current/ref/for.html
>>
>> Instead of modifying the list while you're looping over it, use the
>> topdown argument to os.walk to build the tree from the bottom up instead
>> of from the top down. That way you won't have to futz with the dirnames
>> list at all:
>>
>>     import os
>>     import re
>>
>>     def clean_names(rootpath):
>>         # Characters and escape sequences to replace in directory names.
>>         bad = re.compile(r'%2f|%25|%20|[*?<>/\|\\]')
>>         # Bottom-up: subdirectories are visited before their parent.
>>         for root, dirs, files in os.walk(rootpath, topdown=False):
>>             for dname in dirs:
>>                 newdname = re.sub(bad, '-', dname)
>>                 if newdname != dname:
>>                     newpath = os.path.join(root, newdname)
>>                     oldpath = os.path.join(root, dname)
>>                     os.renames(oldpath, newpath)
>>
>> Notice also the use of re.sub to do all the character substitutions at
>> once. Your code as written would have failed on a filename like "foo*?",
>> since it always renamed from the original filename: it would have first
>> done os.renames("foo*?", "foo-?") followed by os.renames("foo*?",
>> "foo--") and the second would have raised an OSError.
>>
> 
> 
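
To make the re.sub point in the quoted reply concrete, here is a small
sketch. The pattern is the one from the quoted code; everything else is
an assumption made for illustration: the helper names clean and
fake_rename, and the set of names standing in for a real directory so
that nothing on disk is touched. It is not the original script, only a
model of "always rename from the original name".

```python
import re

# Pattern taken from the quoted reply.
BAD = re.compile(r'%2f|%25|%20|[*?<>/\|\\]')


def clean(name):
    """Replace every bad character or escape sequence in one pass."""
    return BAD.sub('-', name)


def fake_rename(names, old, new):
    """Hypothetical stand-in for os.rename that works on a set of
    names: it fails if the old name no longer exists."""
    if old not in names:
        raise OSError("no such file: %r" % old)
    names.remove(old)
    names.add(new)


# One substitution pass, then a single rename: works.
cleaned = clean("foo*?")
print(cleaned)  # foo--

# Renaming from the original name once per bad character: the first
# rename succeeds, the second fails because "foo*?" is already gone.
names = {"foo*?"}
original = "foo*?"
fake_rename(names, original, original.replace("*", "-"))
try:
    fake_rename(names, original, cleaned)
    second_rename_failed = False
except OSError:
    second_rename_failed = True
print(second_rename_failed)  # True
```

Computing the final name first and renaming once keeps the old path
valid for the only rename that needs it.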





