[Tutor] How to parse and extract data from a log file?

John Fouhy john at fouhy.net
Wed Aug 8 01:07:48 CEST 2007

On 08/08/07, Tim Finley <gofinner at hotmail.com> wrote:
> I'm a newbie to programming and am trying to learn Python.  Maybe I'm wrong,
> but I thought a practical way of learning it would be to create a script.  I
> want to automate the gathering of mailbox statistics for users in a post
> office.  There are two lines containing this information for each user.  I
> want to find the two lines for each user and place the information in a
> different file.  I can't figure out how to find the information I'm after.
> Can you provide me an example or refer me to some place that has it?

Hi Tim,

My first step in approaching a problem like this would probably be to
parse the data.  "Parsing" means taking text data and adding structure
to it.

For example, suppose I had a data file "employees.csv" that looks like this:


where the data format is: id, first name, surname, job

I might proceed like this:

# This is a dictionary.  I will use this to store the parsed information.
employees = {}
# open the file for reading
infile = open('employees.csv')

# go through the input file, one line at a time
for line in infile:
    # remove the newline character at the end of each line
    line = line.strip()
    # split up line around comma characters
    id, first, last, job = line.split(',')
    # store data in dictionary
    employees[int(id)] = { 'first':first, 'last':last, 'job':job }

Once we get to here, we can do things like this:

# What is employee 3's name?
print(employees[3]['first'], employees[3]['last'])

# What is employee 1's job?
print(employees[1]['job'])
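
If you want something you can run without a data file on disk, here is a
minimal sketch of the same idea.  The sample rows, the SAMPLE name and the
parse_employees function are all made up for illustration, and I have
swapped line.split(',') for the standard csv module, which also copes with
quoted fields.  It assumes every non-blank row has exactly four fields.

```python
import csv
import io

# Hypothetical sample data; a real employees.csv may look different.
SAMPLE = """1,Alice,Smith,engineer
2,Bob,Jones,accountant
3,Carol,Brown,manager
"""

def parse_employees(lines):
    """Return a dict mapping employee id (int) to a dict of fields."""
    employees = {}
    for row in csv.reader(lines):
        if not row:
            continue  # skip blank lines
        # strip stray whitespace around each of the four fields
        emp_id, first, last, job = [field.strip() for field in row]
        employees[int(emp_id)] = {'first': first, 'last': last, 'job': job}
    return employees

# io.StringIO stands in for an open file here
employees = parse_employees(io.StringIO(SAMPLE))

print(employees[3]['first'], employees[3]['last'])
print(employees[1]['job'])
```

With a real file you would pass the open file object to parse_employees
instead of the io.StringIO wrapper.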

This might not be the best data structure for you; it depends on what
your data looks like and what you want to do with it.  Python also has
lists and tuples.  I encourage you to go through the tutorial :-)
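
As a rough illustration of the alternative, here is a small sketch that
keeps each record as a tuple inside a list.  The rows are invented for the
example.  A list keeps the records in file order, which suits "go through
everything" tasks, while the dictionary approach is handier when you want
to look up one record by its id.

```python
# Hypothetical records, already split into (id, first, last, job) fields.
rows = [
    (1, 'Alice', 'Smith', 'engineer'),
    (2, 'Bob', 'Jones', 'accountant'),
    (3, 'Carol', 'Brown', 'manager'),
]

# Walk the list in order, unpacking each tuple into named variables.
for emp_id, first, last, job in rows:
    print(emp_id, first, last, job)

# Filter the list: first names of everyone whose job is 'engineer'.
engineers = [first for (emp_id, first, last, job) in rows if job == 'engineer']
print(engineers)
```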

