[Tutor] making a function work with os.path.walk()

Rich Krauter rmkrauter at yahoo.com
Fri Feb 13 21:09:02 EST 2004


On Fri, 2004-02-13 at 19:05, Christopher Spears wrote:
> I wrote a function called get_size(n,dir,files).  This
> is the code:
> 
> import os
> 
> def get_size(n,dir,files):

Don't use dir as a parameter name. It masks the built-in dir() function.
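
A small sketch of what that masking looks like in practice. The two function names below are made up for illustration and are not part of the original code; the print calls are written so they run on both Python 2 and Python 3.

```python
def show_attrs_shadowed(dir):
    # Inside this function, 'dir' is the string argument,
    # so the built-in dir() cannot be reached by that name.
    try:
        return dir("abc")
    except TypeError as exc:
        return "TypeError: %s" % exc


def show_attrs(dirname):
    # With a different parameter name the built-in still works.
    return dir(dirname)


shadowed_result = show_attrs_shadowed("/tmp")
normal_result = show_attrs("/tmp")
print(shadowed_result)            # a TypeError message: the str is not callable
print("upper" in normal_result)   # the built-in listed the str attributes
```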

>     files_of_size = {}; sizes = [];
>     biggest_files = []; files_sizes = [];

You can get rid of files_sizes. It's not used.

>     files = map(os.path.join,[dir] * len(files),files)

When debugging, it helps to put a few print statements here and there to
see what's going on. Put one here to print the 'files' list.

>     files = filter(lambda x: not os.path.isdir(x)
>                    and not os.path.islink(x),
>                    os.listdir(dir))

And put another one here to print the 'files' list again. Those print
statements should show you that 'os.listdir(dir)' in the line above needs
to be replaced with 'files'. os.listdir() returns bare file names, so
calling it again throws away the path information you just added with
os.path.join().
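
To make the lost-path problem concrete, here is a minimal sketch. It assumes a throwaway directory created with tempfile (the file name a.txt is arbitrary) rather than the directories from the original run.

```python
import os
import shutil
import tempfile

# Build a scratch directory holding one 5-byte file.
tmpdir = tempfile.mkdtemp()
with open(os.path.join(tmpdir, "a.txt"), "w") as fh:
    fh.write("hello")

# os.listdir() gives names only, with no directory part.
bare_names = os.listdir(tmpdir)

# Joining the directory back on gives paths that work from anywhere.
full_paths = [os.path.join(tmpdir, name) for name in bare_names]

# A bare name would be resolved against the current working directory,
# so the size lookup has to use the joined path.
sizes = [os.path.getsize(p) for p in full_paths]
print(bare_names)
print(sizes)

shutil.rmtree(tmpdir)
```

The same applies to os.path.isdir() and os.path.islink() in the filter: given a bare name, they check the current working directory, not the directory being walked.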
> 
>     for f in files:
>         files_of_size.setdefault(os.path.getsize(f),[]).append(f)
> 
>     sizes = files_of_size.keys()
>     sizes.sort(); sizes.reverse();
>     for s in sizes:
>         biggest_files.append([files_of_size[s],s])
> 
>     while n > 0:
>         print biggest_files[n-1]
>         n = n - 1
> 

import os
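def get_size(n, dirname, files):
    # Visitor for os.path.walk(): called as get_size(arg, dirname, names).
    print dirname
    files_of_size = {}
    biggest_files = []
    # Prepend the directory so each name becomes a usable path.
    files = map(os.path.join, [dirname] * len(files), files)
    #print files
    # Keep regular files only; skip directories and symlinks.
    files = filter(lambda x: not os.path.isdir(x)
                   and not os.path.islink(x), files)
    #print files
    # Group the paths by size in bytes.
    for f in files:
        files_of_size.setdefault(os.path.getsize(f), []).append(f)
    # Sort the sizes once, largest first.
    sizes = files_of_size.keys()
    sizes.sort(); sizes.reverse()
    for s in sizes:
        biggest_files.append([files_of_size[s], s])

    # Print the n largest size groups found in this directory.
    for f in biggest_files[:n]:
        print f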


if __name__ == '__main__':
    os.path.walk('/tmp',get_size,3)


I ran this on my /tmp directory. One of the directories in there had
about 1000 0-byte files and nothing larger. Because the files are grouped
by size, all of them landed in a single entry, so I got back a huge list
of 0-byte files for that dir. That behavior may or may not be what you
want.
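
If what you want is the n largest individual files per directory rather than groups of equal-sized files, something along these lines might be closer. This is only a sketch under a few assumptions: it swaps os.path.walk() for os.walk() (os.path.walk() was removed in Python 3), the name largest_files is made up, and the demo at the bottom uses a scratch directory instead of /tmp.

```python
import os
import shutil
import tempfile


def largest_files(top, n):
    """Return {dirpath: [(size, path), ...]}, at most n entries per
    directory, largest first."""
    result = {}
    for dirpath, dirnames, filenames in os.walk(top):
        sized = []
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # skip symlinks, like the filter in get_size
            try:
                sized.append((os.path.getsize(path), path))
            except OSError:
                continue  # file vanished or is unreadable; ignore it
        sized.sort(reverse=True)
        result[dirpath] = sized[:n]
    return result


# Demo on a scratch directory with files of 1, 2, 3 and 4 bytes.
demo_dir = tempfile.mkdtemp()
for i in range(1, 5):
    with open(os.path.join(demo_dir, "f%d" % i), "w") as fh:
        fh.write("x" * i)

report = largest_files(demo_dir, 3)
demo_sizes = [size for size, path in report[demo_dir]]
print(demo_sizes)

shutil.rmtree(demo_dir)
```

With this version, a directory holding 1000 empty files would report just n of them instead of one entry listing all 1000.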

Hope that helps.
Rich




