[Tutor] Parsing webserver log files
Alan Gauld
alan.gauld at btinternet.com
Sat Jul 7 21:12:10 CEST 2007
"Keith" <cubexican at gmail.com> wrote
> I've been moving up step by step, first using regular expressions to
> find IP addresses and URLs within a line of a log file and list them.
It sounds like you are on the right lines.
> Now I'm at a point where I'll need to differentiate between lines of
> the log file and generate a long list of IP and corresponding URL.
If you put your code for processing a single line into a function
then you can just iterate over the log file using a for loop.
Use your function to extract the IP and URL and then write
those to a report or just append to a list.
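As a rough sketch of that idea, assuming the lines follow Apache's common log layout (client IP first, then the request in double quotes). The name process_line, the regular expression and the sample lines are made up for illustration and are not taken from Keith's code:

```python
import re

# Assumed pattern: client IP at the start of the line, then the quoted
# request ("METHOD URL PROTOCOL") as in Apache's common log format.
LINE_RE = re.compile(r'^(\d{1,3}(?:\.\d{1,3}){3}) .*?"[A-Z]+ (\S+)')

def process_line(line):
    """Return (ip, url) for one log line, or None if it does not match."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    return match.group(1), match.group(2)

# Made-up sample lines, for illustration only.
sample_lines = [
    '192.168.0.1 - - [07/Jul/2007:21:12:10 +0200] "GET /index.html HTTP/1.1" 200 2326',
    'not a log line',
    '10.0.0.7 - - [07/Jul/2007:21:12:11 +0200] "POST /cgi-bin/form HTTP/1.0" 404 512',
]

# Collect one (ip, url) pair per line that matched.
results = []
for line in sample_lines:
    pair = process_line(line)
    if pair is not None:
        results.append(pair)
```

Returning None for lines that don't match lets the loop skip them quietly; you could count or print them instead if you want to see what the pattern misses.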
> I'll need to utilize the built in functions file() and readlines()
You may not need readlines because files are now iterable so you can
just do:
    for line in open('logfile.log'):
        processLine(line)
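A slightly fuller sketch of the same loop, which also closes the file when it is done. It writes a small throwaway file first so that it runs on its own; the file contents and the stand-in process_line below are assumptions for illustration only, so substitute your own function and log file:

```python
import os
import tempfile

def process_line(line):
    # Stand-in for your own per-line function: here it just
    # returns the first whitespace-separated field of the line.
    fields = line.split()
    return fields[0] if fields else None

# Create a small throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile('w', suffix='.log', delete=False) as tmp:
    tmp.write('192.168.0.1 - - "GET /a HTTP/1.1"\n')
    tmp.write('10.0.0.7 - - "GET /b HTTP/1.1"\n')
    log_path = tmp.name

first_fields = []
with open(log_path) as logfile:
    for line in logfile:  # the file object yields one line at a time
        first_fields.append(process_line(line))

os.remove(log_path)  # tidy up the throwaway file
```

Iterating over the file object reads one line at a time, whereas readlines() builds a list of every line in memory first, which matters once the log gets large.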
> and by opening the file within python using open() or file(). I don't
> have any programming experience and the webserver log files are from
> an Apache HTTP server and are therefore in that format.
You are heading the right way. Use functions to package up bits
of discrete functionality and write a high level program using those
functions.
HTH
--
Alan Gauld
Author of the Learn to Program web site
http://www.freenetpages.co.uk/hp/alan.gauld