[Tutor] Parsing webserver log files

Alan Gauld alan.gauld at btinternet.com
Sat Jul 7 21:12:10 CEST 2007


"Keith" <cubexican at gmail.com> wrote

> I've been moving up step by step, first using regular expressions to
> find IP addresses and URLs within a line of a log file and list them.

It sounds like you are on the right lines.

> Now I'm at a point where I'll need to differentiate between lines of
> the log file and generate a long list of IP and corresponding URL.

If you put your code for processing a single line into a function
then you can just iterate over the log file using a for loop.

Use your function to extract the IP and URL and then write
those to a report or just append to a list.
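
As a rough sketch of that idea, assuming the log is in Apache's Common
Log Format (client address first, request inside double quotes), a
per-line function might look something like this. The name process_line,
the pattern and the sample line are only illustrations, not your code:

```python
import re

# Assumed pattern for an Apache Common Log Format line:
# the client address comes first, the request is inside double quotes.
LINE_RE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "\S+ (\S+)[^"]*"')

def process_line(line):
    """Return (ip, url) for one log line, or None if it does not match."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    return match.group(1), match.group(2)

# Append each successful result to a list, as described above.
results = []
sample = '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326'
pair = process_line(sample)
if pair is not None:
    results.append(pair)
print(results)
```

Returning None for lines that do not match lets the calling loop decide
whether to skip them or report them.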

> I'll need to utilize the built in functions file() and readlines()

You may not need readlines() because file objects are iterable, so you
can just do:

with open('logfile.log') as logfile:   # the file is closed automatically
    for line in logfile:
        processLine(line)

> and by opening the file within python using open() or file(). I don't
> have any programming experience and the webserver log files are from
> an Apache HTTP server and are therefore in that format.

You are heading the right way. Use functions to package up bits
of discrete functionality and write a high level program using those
functions.
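
To illustrate that structure, here is a minimal sketch built from small
functions, one per job, with a short top level that ties them together.
All the names (find_ip, find_url, collect, report) and both patterns are
assumptions made up for the example; your own regular expressions would
go in their place:

```python
import re

# Hypothetical helpers; the patterns are simple assumptions, not a full
# Apache log parser.
IP_RE = re.compile(r'\b\d{1,3}(?:\.\d{1,3}){3}\b')
REQUEST_RE = re.compile(r'"[A-Z]+ (\S+)')

def find_ip(line):
    """Return the first IPv4-looking address in the line, or None."""
    match = IP_RE.search(line)
    return match.group(0) if match else None

def find_url(line):
    """Return the requested path from the quoted request, or None."""
    match = REQUEST_RE.search(line)
    return match.group(1) if match else None

def collect(lines):
    """Build a list of (ip, url) pairs, skipping lines missing either part."""
    pairs = []
    for line in lines:
        ip, url = find_ip(line), find_url(line)
        if ip and url:
            pairs.append((ip, url))
    return pairs

def report(pairs):
    """Format each pair as one line of text."""
    return '\n'.join('%s %s' % pair for pair in pairs)

# Any iterable of lines works here, including an open file object.
sample_lines = [
    '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326',
    'not a log line',
]
print(report(collect(sample_lines)))
```

Because collect() accepts any iterable of lines, you can test it with a
short list like this before pointing it at the real log file.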

HTH

-- 
Alan Gauld
Author of the Learn to Program web site
http://www.freenetpages.co.uk/hp/alan.gauld 



