python vs awk for simple sysadmin tasks

Steve Lamb grey at
Sat Jun 5 01:43:06 CEST 2004

On 2004-06-04, William Park <opengeometry at> wrote:
> 4x faster?  Not very impressive.  I suspect that it's poor quality shell
> script to begin with.  Would you post this script, so that others can
> correct your misguided perception?


1: It was an internal script for statistics gathering and I did not have
permission to expose that code to the public.

2: Even if I did I no longer work there.

    The gist of it though was that it was a disk usage script which tabulated
usage for a few hundred thousand customers.  It had to go through several
slices (it wasn't a single large directory) find the customers in each of
those slices, calculate their disk usage and create a log of it.

    The Perl recode came about when management wanted some exclusions put in
and the shell script was breaking at that point.  They also wanted a lower
run-time if possible.  So I spent an hour or two, most of it in the recursion
walk across the file-system (thank dog Python has os.path.walk!) rewriting it
in Perl.  The stat calls were not reduced, we still had to do a stat on every
file to get the size as before.  However we no longer were going through the
overhead of constantly opening and closing pipes to/from du as well as the
numerous exec calls.
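
    For illustration only, here is a minimal Python sketch of the same idea:
walk the tree in-process and stat each file, instead of forking du and reading
its output through a pipe.  It is not the original script.  The names
disk_usage, usage_by_customer, slice_dirs and excluded are hypothetical, the
one-directory-per-customer layout inside each slice is an assumption, and it
uses os.walk (the current replacement for os.path.walk).  It sums st_size, the
apparent file size, which can differ from the block counts du reports.

```python
import os


def disk_usage(root, excluded=()):
    """Return the total size in bytes of the files under root.

    excluded is a collection of directory names to skip (hypothetical
    stand-in for the exclusions management asked for).
    """
    total = 0
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune excluded directories in place so os.walk never descends
        # into them.
        dirnames[:] = [d for d in dirnames if d not in excluded]
        for name in filenames:
            try:
                # Still one stat per file, but done in-process: no exec
                # of du and no pipe to open and close.
                total += os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                # The file vanished or is unreadable; skip it.
                continue
    return total


def usage_by_customer(slice_dirs, excluded=()):
    """Map customer directory name to bytes used, across all slices.

    Assumes each slice directory holds one subdirectory per customer.
    """
    usage = {}
    for slice_dir in slice_dirs:
        with os.scandir(slice_dir) as entries:
            for entry in entries:
                if entry.is_dir(follow_symlinks=False):
                    size = disk_usage(entry.path, excluded)
                    usage[entry.name] = usage.get(entry.name, 0) + size
    return usage
```

    Writing the log is then a matter of iterating over the returned
dictionary; the whole run stays inside a single process.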

    A 4x speedup, measured in hours, that comes pretty much entirely from no
longer building up and tearing down those pipes and executing the same program
over and over is rather impressive, given how minuscule those operations are
in general.

         Steve C. Lamb         | I'm your priest, I'm your shrink, I'm your
       PGP Key: 8B6E99C5       | main connection to the switchboard of souls.
