Optimizing code

Gerrit Holl gerrit.holl at pobox.com
Thu Feb 24 10:19:30 EST 2000


Hello,

I have the following script:

#! /usr/bin/env python

import sys
import os

class DiskUsage:
    __size = 0
    def add(self, filename):
        self.__size = self.__size + os.path.getsize(filename)
    def __call__(self, arg, d, files):
        for file in files:
            filename = os.path.join(d, file)
            if os.path.isfile(filename): self.add(filename)

    def __len__(self):
        return self.__size

def du(dir):
    disk = DiskUsage()
    os.path.walk(dir, disk, ())
    return len(disk)

def main():
    if len(sys.argv) != 2:
        sys.stderr.write("usage: %s <directory>\n" % sys.argv[0])
        sys.exit(1)
    print du(sys.argv[1])

if __name__ == '__main__':
    main()
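
For comparison, a rough sketch of the same traversal for later Python
versions. This is an assumption on top of the script above, which targets
the Python of 2000: os.path.walk was removed in Python 3, so the sketch
uses os.walk and a plain running total instead of the callback class.

```python
import os

def du(top):
    """Return the total size in bytes of regular files under top."""
    total = 0
    for dirpath, dirnames, filenames in os.walk(top):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Same filter as the script above; isfile() follows symlinks.
            if os.path.isfile(path):
                total += os.path.getsize(path)
    return total
```

Calling du('.') returns the byte count for the current directory tree.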

Timing shows that the 'os.path.walk' part takes about 2.7 seconds for a
400 MB directory with 1096 directories and 9082 files, whereas 'du -s ~'
takes 0.2 seconds. What makes this slow? The special methods? The repeated
rebinding of the integer attribute? os.path.walk itself? With longs, it
even takes 12 seconds...
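
One way to narrow this down might be to time the bare directory traversal
separately from the traversal plus the per-file isfile/getsize calls (each
of which does its own stat). The sketch below assumes a later Python
(os.walk, time.perf_counter); the helper names timed, walk_only,
walk_and_stat and compare are made up for illustration.

```python
import os
import time

def timed(func, *args):
    """Run func(*args) once and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start

def walk_only(top):
    """Traverse the tree without stat-ing files; return the entry count."""
    count = 0
    for dirpath, dirnames, filenames in os.walk(top):
        count += len(filenames)
    return count

def walk_and_stat(top):
    """Traverse the tree and sum file sizes, as the script does."""
    total = 0
    for dirpath, dirnames, filenames in os.walk(top):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path):
                total += os.path.getsize(path)
    return total

def compare(top):
    """Return the elapsed time in seconds for each variant."""
    _, t_walk = timed(walk_only, top)
    _, t_stat = timed(walk_and_stat, top)
    return {"walk_only": t_walk, "walk_and_stat": t_stat}
```

The second measurement runs with a warm filesystem cache, so it is fairer
to run compare() twice and look at the second set of numbers.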

Can I optimize it? If so, how?

regards,
Gerrit.

P.S.
I know it's _easier_ to do os.popen('du'), but 1) it's not cross-platform
and 2) optimizing is instructive.

-- 
Comparison of Python GUIs: http://www.nl.linux.org/~gerrit/gui.html
Please comment!



