Clay Shirky clay at shirky.com
Tue Nov 11 18:58:45 EST 2003

So I have a webserver log, and need to post-process it a bit. I wrote a
simple script in both Python and Perl, and the Perl runs just about twice as
fast.

Here are the two scripts -- the logic is line-by-line identical, except for
one try/except idiom. My question is: what should I do to speed up the
Python version?


import re

f = open("today.test") # the server log file

fields = []

for line in f:
    if re.match('^shirky.com', line): # find hits from my site
        fields = line.split()
        try: referer = fields[11] # grab the referer
        except IndexError: continue # continue if there is a mangled line
        referer = re.sub('"', '', referer)
        if re.search("shirky", referer): continue # ignore internal links
        if re.search("-", referer):      continue # ...and email clicks
        referer = re.sub("www.", "", referer)
        print referer

# -----------------------------------------------


open (F, "today.test");

while (<F>) {
    if (/^shirky.com/) {
        @fields = split;
        $referer = $fields[11];
        $referer =~ s/"//g;
        next if $referer =~ /shirky/;
        next if $referer =~ /-/;
        $referer =~ s/www.//;
        print "$referer\n";
    }
}

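One direction that might be worth trying on the Python side, as a rough sketch rather than a measured answer: none of the patterns above really needs the regex engine, so plain string methods (startswith, in, replace) can stand in for the re calls. The function name external_referers and its site and field_index parameters are hypothetical, made up for this illustration. It also assumes the dots in 'shirky.com' and 'www.' were meant literally, and it removes only the first 'www.' the way the Perl s/www.// does (re.sub removes every occurrence). Whether this closes the gap with Perl would need timing on the real log.

```python
def external_referers(lines, site="shirky.com", field_index=11):
    """Yield external referers from log lines (hypothetical helper).

    Assumes the referer is the whitespace-separated field at field_index,
    as in the scripts above.
    """
    for line in lines:
        # Literal prefix test instead of re.match('^shirky.com', line)
        if not line.startswith(site):
            continue
        fields = line.split()
        # Skip mangled lines that are too short to hold a referer
        if len(fields) <= field_index:
            continue
        referer = fields[field_index].replace('"', "")
        # Ignore internal links and email clicks
        if "shirky" in referer or "-" in referer:
            continue
        # Drop the first literal "www." only, like the Perl s/www.//
        yield referer.replace("www.", "", 1)
```

To run it over the same file, something like this should do: open "today.test", loop over external_referers(f), and print each value.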