os.walk walks too much
Edward C. Jones
edcjones at erols.com
Wed Feb 25 09:53:58 EST 2004
Marcello Pietrobon wrote:
> Hello,
> I am using Python 2.3
> I desire to walk a directory without recursion
I am not sure what this means. Do you want to iterate over the
non-directory files in directory top? For this job I would use:
    import os

    def walk_files(top):
        names = os.listdir(top)
        for name in names:
            # listdir returns bare names, so join with top before testing
            if os.path.isfile(os.path.join(top, name)):
                yield name
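If you would rather stay with os.walk, another way to get the same result is to take only the first triple it yields, since that triple describes top itself. This is only a sketch: the name top_level_files and the throwaway directory layout are made up for illustration.

```python
import os
import tempfile

def top_level_files(top):
    """Return the names of non-directory entries directly inside top."""
    # os.walk is lazy; the first (root, dirs, files) triple describes top itself,
    # so returning inside the loop means no subdirectory is ever visited.
    for root, dirs, files in os.walk(top):
        return sorted(files)
    return []  # os.walk yielded nothing (e.g. top does not exist)

# Small demonstration in a throwaway directory (hypothetical layout).
demo = tempfile.mkdtemp()
open(os.path.join(demo, "a.txt"), "w").close()
os.mkdir(os.path.join(demo, "sub"))
open(os.path.join(demo, "sub", "b.txt"), "w").close()

print(top_level_files(demo))
```

The file in "sub" is not reported, because the function returns before walk() descends.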
> this only partly works:
> def walk_files() :
>     for root, dirs, files in os.walk(top, topdown=True):
>         for filename in files:
>             print( "file:" + os.path.join(root, filename) )
>         for dirname in dirs:
>             dirs.remove( dirname )
> because it skips all the subdirectories but one.
Replace
        for dirname in dirs:
            dirs.remove( dirname )
with
        for i in range(len(dirs)-1, -1, -1):
            del dirs[i]
to make it work. Run
    seq = [0,1,2,3,4,5]
    for x in seq:
        seq.remove(x)
    print(seq)   # prints [1, 3, 5], not []
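to see the problem: each removal shifts the later elements one position
to the left while the loop index still advances, so every other element
is skipped. If you are iterating through a list and selectively removing
members, iterate in reverse. Never change the positions in the list of
elements that the iterator has not yet reached.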
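As a rough illustration of that rule, here are two ways to remove members selectively without skipping any. The helper names and the "remove the even numbers" condition are made up for the example; the second variant, iterating over a copy, is a common alternative rather than something from the original question.

```python
def remove_evens_reverse(seq):
    # Walk the indices from the end, so a deletion only shifts
    # elements that have already been visited.
    for i in range(len(seq) - 1, -1, -1):
        if seq[i] % 2 == 0:
            del seq[i]

def remove_evens_copy(seq):
    # Iterate over a shallow copy while mutating the original list.
    for x in seq[:]:
        if x % 2 == 0:
            seq.remove(x)

a = [0, 1, 2, 3, 4, 5]
b = [0, 1, 2, 3, 4, 5]
remove_evens_reverse(a)
remove_evens_copy(b)
print(a)
print(b)
```

Both lists end up holding only the odd numbers, because neither loop depends on positions that a deletion has moved.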
> this *does not* work at all
> def walk_files() :
>     for root, dirs, files in os.walk(top, topdown=True):
>         for filename in files:
>             print( "file:" + os.path.join(root, filename) )
>         dirs = []
There is a subtle point in the documentation.
"When topdown is true, the caller can modify the dirnames list in-place
(perhaps using del or slice assignment), and walk() will only recurse
into the subdirectories whose names remain in dirnames; ..."
The key word is "in-place". "dirs = []" does not change "dirs" in-place.
It rebinds the local name "dirs" to a new, empty list; the list object
that walk() holds is untouched, so walk() still recurses. Either use "del"
        for i in range(len(dirs)-1, -1, -1):
            del dirs[i]
as I did above or use "slice assignment"
        dirs[:] = []
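To see the difference between rebinding and in-place modification side by side, a small test along these lines may help. It is a sketch only: the function names and the temporary directory layout are invented for the demonstration, and the functions collect paths in a list instead of printing them.

```python
import os
import tempfile

def files_rebinding(top):
    found = []
    for root, dirs, files in os.walk(top, topdown=True):
        found.extend([os.path.join(root, f) for f in files])
        dirs = []      # rebinds the local name only; walk() still descends
    return found

def files_in_place(top):
    found = []
    for root, dirs, files in os.walk(top, topdown=True):
        found.extend([os.path.join(root, f) for f in files])
        dirs[:] = []   # empties the list walk() holds; no recursion
    return found

# Throwaway directory: one file at the top, one inside a subdirectory.
demo = tempfile.mkdtemp()
open(os.path.join(demo, "a.txt"), "w").close()
os.mkdir(os.path.join(demo, "sub"))
open(os.path.join(demo, "sub", "b.txt"), "w").close()

print(len(files_rebinding(demo)))
print(len(files_in_place(demo)))
```

The rebinding version finds both files, while the slice-assignment version finds only the one in the top directory.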