[Tutor] directory size

Danny Yoo dyoo@hkn.eecs.berkeley.edu
Fri, 2 Aug 2002 13:00:53 -0700 (PDT)


On Fri, 2 Aug 2002, Klaus Boehm wrote:

> How can i determine the size of a directory?
>  In Linux there is a command like " du -hc ." .
>  Is there a similar way in python.

So 'du -hc' tries to find the total amount of disk space that a directory
and all its subdirectories take up?  I'm not sure if this is built in, but
we can talk about how to write it.


That doesn't sound too bad if we define a disk_usage() function
recursively:

    A directory takes up as much space as that of its regular files,
    plus that of all its subdirectories.


So one way to write disk_usage() could be:

###
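>>> import os
>>> def disk_usage(directory):
...     # get_files() and get_subdirs() are helpers we still have to write.
...     files, subdirs = get_files(directory), get_subdirs(directory)
...     total = 0    # 'total' rather than 'sum', which is a built-in name
...     for f in files: total = total + os.path.getsize(f)
...     for s in subdirs: total = total + disk_usage(s)
...     return total
...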
###

(I haven't written get_files() or get_subdirs(), but those shouldn't be
too bad.  Make sure that both functions return absolute pathnames, just to
avoid some silly problems with relative paths.)
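
For what it's worth, here is one rough way those two helpers might look.
This is only a sketch: the bodies are my guess at something reasonable,
built on os.listdir() and the os.path tests, and they do nothing special
about symbolic links.

```python
import os

def _entries(directory):
    """Absolute paths of everything directly inside directory."""
    directory = os.path.abspath(directory)
    return [os.path.join(directory, name) for name in os.listdir(directory)]

def get_files(directory):
    """Return absolute paths of the regular files directly in directory."""
    return [p for p in _entries(directory) if os.path.isfile(p)]

def get_subdirs(directory):
    """Return absolute paths of the subdirectories directly in directory."""
    return [p for p in _entries(directory) if os.path.isdir(p)]
```

With these defined, something like disk_usage(".") should give the number
of bytes in the files under the current directory.  Note that
os.path.getsize() reports file lengths, so the figure need not match what
'du' prints, since 'du' counts disk blocks.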


One major problem with this approach is that we need to be careful about
symbolic links: if a symbolic link forms a loop, the recursion could
follow it around and around.  When we write get_files() and get_subdirs(),
we may want to avoid symbolic links by filtering them out with
os.path.islink().  Or we can keep track of which directories we've dived
into already.
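
To make those two ideas concrete, here is a sketch of each.  The function
names are made up for illustration, and both versions do their own
directory listing instead of going through get_files() and get_subdirs(),
just so the example stands on its own.

```python
import os

def disk_usage_skip_links(directory):
    """Sum file sizes under directory, ignoring every symbolic link."""
    total = 0
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        if os.path.islink(path):
            continue  # neither count nor follow symbolic links
        if os.path.isfile(path):
            total += os.path.getsize(path)
        elif os.path.isdir(path):
            total += disk_usage_skip_links(path)
    return total

def disk_usage_track_seen(directory, seen=None):
    """Follow links, but dive into each real directory only once."""
    if seen is None:
        seen = set()
    real = os.path.realpath(directory)  # resolve any links in the path
    if real in seen:
        return 0  # already visited, so a loop ends here
    seen.add(real)
    total = 0
    for name in os.listdir(real):
        path = os.path.join(real, name)
        if os.path.isfile(path):
            total += os.path.getsize(path)
        elif os.path.isdir(path):
            total += disk_usage_track_seen(path, seen)
    return total
```

The first version is the simpler of the two.  The second only guards
against revisiting directories: a file reached through a link to a file
would still be counted a second time.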



A variation on this recursive way of finding disk usage can use the
os.path.walk() function, which does the tricky recursion stuff for us.
If you'd like, we can give an example of how to use it.
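
Here is roughly what such an example might look like.  One caveat:
os.path.walk() exists only in Python 2, so this sketch assumes a newer
Python and uses os.walk() instead, which is the usual replacement.  The
function name disk_usage_walk() is made up for this example.

```python
import os

def disk_usage_walk(directory):
    """Sum the sizes of all regular files below directory using os.walk()."""
    total = 0
    # By default os.walk() does not follow symbolic links to directories,
    # so a link that loops back on itself is not a problem here.
    for dirpath, dirnames, filenames in os.walk(directory):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):  # skip links to files, too
                total += os.path.getsize(path)
    return total
```

The recursion is hidden inside os.walk(): it hands us each directory in
turn along with the names of the files in it, so all we have to do is add
up the sizes.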


If you have more questions, please feel free to ask!