newbie: datastructure `dictionary' question

John Machin sjmachin at lexicon.net
Sat Sep 9 12:00:35 EDT 2006


jason wrote:
> Hello,
>
> I am completely new to python and I have a question that I unfortunately
> could not find answered in the various documentation online. My best guess
> is that the answer should be quite easy, but I have just entered the
> learning phase, so that means a heightened chance for stupidity and
> mistakes on my part.
>
> I am trying to fill a nested dictionary from parsing a logfile. However
> each time there is only one key entry created and that's it. Just
> one entry, while the keys are different. That's 100% sure. I think
> therefore that it is an assignment error on my part. [there we have it...]
>
> To give a static example of the datastructure that I am using, to clear up
> any confusion on the datastructure part:
>
>     records = { 'fam/jason-a' : {
>         'date'    : 'Fri Sep  8 16:45:55 2006',
>         'from'    : 'jason',
>         'subject' : 'Re: Oh my goes.....',
>         'msize'   : '237284' },
>                 'university/solar-system' : {
>         'date'    : 'Fri Sep  8 16:45:46 2006',
>         'from'    : 'jd',
>         'subject' : 'Vacancies for students',
>         'msize'   : '9387' }
>     }
>
> Looping over this datastructure is no problem.
>     rkeys = ['date', 'from', 'subject', 'msize']
>     for folder in records.keys():
>         print '--'
>         print folder
>         for key in rkeys:
>             print records[folder][key]
>
> Now for the actual program/dynamic part - assignment in the loop I use the
> following function. Note `datum' is not a date object, just a string.
>
> def parselog(data):
>     other = 0
>     records = {}
>
>     for line in string.split(data, '\n'):
>         str = line.strip()
>         if str[:4] == 'From':
>             mfrom, datum = extrfrom(str), extrdate(str)
>             print datum, mfrom
>         elif str[:4] == 'Fold':
>             folder = extrfolder(str[8:])
>             records = {folder : { 'date' : datum, 'mesgbytes' : extrmsize(str[8:]), 'mesgcount' : 1}}

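You are *assigning* records = blahblah each time around the loop. That
rebinds the name "records" to a brand-new one-entry dictionary and throws
the previous one away, so "records" will end up being bound to the
blahblah related to the *last* record that you read.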
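
A minimal sketch of the difference, using made-up (folder, size) pairs in
place of the values parsed from the log; the names "entries", "rebound" and
"filled" are only for illustration:

```python
# Hypothetical (folder, size) pairs standing in for parsed log lines.
entries = [('fam/jason-a', '237284'), ('university/solar-system', '9387')]

# Rebinding: each pass replaces the whole dictionary with a new one.
rebound = {}
for folder, size in entries:
    rebound = {folder: {'mesgbytes': size}}

# Item assignment: each pass adds one key to the same dictionary.
filled = {}
for folder, size in entries:
    filled[folder] = {'mesgbytes': size}

print(len(rebound))  # 1
print(len(filled))   # 2
```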

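Assign to a key of the existing dictionary instead. You can do it item
by item (records[folder] has to hold a dictionary first):
    records[folder] = {}
    records[folder]['date'] = datum
    etc
or as a one-liner:
    records[folder] = {'date': datum, 'mesgbytes': extrmsize(str[8:]),
                       'mesgcount': 1}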
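
If the same folder can turn up more than once in the log, one possible
approach is dict.setdefault, which returns the existing inner dictionary or
stores and returns a fresh one. This is only a sketch: the values are
hypothetical, and bumping 'mesgcount' on every sighting is an assumption
about what the counter is for.

```python
records = {}

# Hypothetical values standing in for what the parser extracted.
folder = 'fam/jason-a'
datum = 'Fri Sep  8 16:45:55 2006'

# setdefault returns the inner dict for this folder, creating it if needed.
entry = records.setdefault(folder, {})
entry['date'] = datum
# Assumed meaning of the counter: one more message seen for this folder.
entry['mesgcount'] = entry.get('mesgcount', 0) + 1
```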

When you find yourself using a dictionary with constant keys like
'date', it's time to start thinking OO.

class LogMessage(object):
    def __init__(self, date, .....):
        self.date = date
        etc

then later:

records[folder] = LogMessage(
                              date=datum,
                              mesgbytes= extrmsize(str[8:]),
                              mesgcount=1,
                              )
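
Filled in, that could look like the sketch below. The attribute names are
taken from the keyword arguments in the call above; the sample folder and
values come from the static example in the question, and the rest is an
assumption rather than a finished design.

```python
class LogMessage(object):
    """One log entry; fields mirror the keyword arguments used above."""
    def __init__(self, date, mesgbytes, mesgcount):
        self.date = date
        self.mesgbytes = mesgbytes
        self.mesgcount = mesgcount

records = {}
# Sample values borrowed from the static example in the question.
records['fam/jason-a'] = LogMessage(
    date='Fri Sep  8 16:45:55 2006',
    mesgbytes='237284',
    mesgcount=1,
)

# Attribute access replaces the string-keyed lookups.
msg = records['fam/jason-a']
print(msg.date)
```

A misspelled attribute such as msg.dat raises AttributeError, much as a
misspelled key raises KeyError; what the class adds is one place to define
the fields and to hang methods on later.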


[snip]

HTH,
John



