[Tutor] Parsing webserver log files

Keith cubexican at gmail.com
Sat Jul 7 20:22:06 CEST 2007


   I'm very new to Python (only a couple of days in), and as part of my
internship my supervisor wants me to learn Python and write a program
that parses web server log files and lists each IP address with the URL
it requested. Since this needs to reflect my own work, I want to write
the whole program myself, but I need some direction.
   I've been working up step by step, first using regular expressions
to find the IP address and URL within a single line of a log file and
list them. Now I'm at the point where I need to process the log file
line by line and generate a long list of IPs and their corresponding
URLs.
   Like I said, I'm using regular expressions to pick out the IP and
URL within a line, and (from the way my supervisor is pointing me) I'll
need to use the built-in file() function and the readlines() method
along with the regular expressions to write the final program.
   Basically, I'm looking for some direction on how to write this
program using regular expressions and opening the file within Python
using open() or file(). I don't have any programming experience. The
log files come from an Apache HTTP server, so they are in Apache's log
format.
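
   To make the question concrete, here is a minimal sketch of the kind
of program described above. It assumes the log uses the Apache common
or combined log format, where each line begins with the client IP and
the request appears in double quotes. The pattern LOG_LINE, the names
parse_lines() and parse_log(), and the sample lines are all made up for
illustration and are not from any real log. It uses open() rather than
file(), since file() exists only in Python 2.

```python
import re

# Assumed layout (Apache common/combined log format):
# host ident user [time] "METHOD url PROTOCOL" status size
# Group 1 captures the client IP/host, group 2 the requested URL.
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "\S+ (\S+)[^"]*"')


def parse_lines(lines):
    """Return a list of (ip, url) pairs, one per line that matches."""
    pairs = []
    for line in lines:
        match = LOG_LINE.match(line)
        if match:  # lines that do not fit the pattern are skipped
            pairs.append((match.group(1), match.group(2)))
    return pairs


def parse_log(path):
    """Open the log file at `path` and parse it one line at a time."""
    with open(path) as log_file:
        # Iterating over the file yields lines without loading the
        # whole file into memory, unlike readlines().
        return parse_lines(log_file)


# Made-up sample lines, for illustration only.
sample = [
    '192.0.2.10 - - [07/Jul/2007:20:22:06 +0200] "GET /index.html HTTP/1.1" 200 2326',
    '192.0.2.25 - bob [07/Jul/2007:20:22:09 +0200] "POST /cgi-bin/form HTTP/1.0" 302 -',
    'not a log line',
]
for ip, url in parse_lines(sample):
    print(ip, url)
```

   Calling parse_log() with the path to a real access log would return
the same kind of list for the whole file. If the server uses a custom
LogFormat, the pattern would need to be adjusted to match it.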

Thanks

