Array of dict or lists or ....?

Tim Chase python.list at tim.thechases.com
Mon Oct 6 20:34:01 EDT 2008


> I can't figure out how to set up a Python data structure to read in data 
> that looks something like this (albeit somewhat simplified and contrived):
> 
> States
>     Counties
>       Schools
>         Classes
>            Max Allowed Students
>            Current enrolled Students
> 
> Nebraska, Wabash, Newville, Math, 20, 0
> Nebraska, Wabash, Newville, Gym, 400, 0
> Nebraska, Tingo,  Newfille, Gym, 400, 0
> Ohio, Dinger, OldSchool, English, 10, 0
> 
> With each line I read in, I would create a hash entry and increment the 
> number of enrolled students.

A Python version of what you describe:

   class TooManyAttendants(Exception): pass
   class Attendence(object):
     def __init__(self, max):
       self.max = int(max)
       self.total = 0
     def accrue(self, other):
       self.total += int(other)
       if self.total > self.max: raise TooManyAttendants
     def __str__(self):
       return "%s/%s" % (self.max, self.total)
     __repr__ = __str__

   data = {}
   for i, line in enumerate(file("input.txt")):
     print line,
     state, county, school, cls, max_students, enrolled = map(
       lambda s: s.strip(),
       line.rstrip("\r\n").split(",")
       )
     try:
       data.setdefault(
         state, {}).setdefault(
         county, {}).setdefault(
         school, {}).setdefault(
         cls, Attendence(max_students)).accrue(enrolled)
     except TooManyAttendants:
       print "Too many Attendants in line %i" % (i + 1)
   print repr(data)
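
For anyone on Python 3, where file() and the print statement are gone, the same nested-setdefault idea might look roughly like the sketch below. The sample lines are copied from the question. The load() helper, the in-memory SAMPLE string, the "Attendance" spelling and the total/max display order are illustrative choices, not part of the code above.

```python
import io


class TooManyAttendants(Exception):
    pass


class Attendance:
    def __init__(self, max_students):
        self.max = int(max_students)
        self.total = 0

    def accrue(self, count):
        # Add to the running total, then complain if the cap is exceeded.
        self.total += int(count)
        if self.total > self.max:
            raise TooManyAttendants

    def __repr__(self):
        return "%s/%s" % (self.total, self.max)


# Sample rows taken from the question; a real run would read a file instead.
SAMPLE = """\
Nebraska, Wabash, Newville, Math, 20, 0
Nebraska, Wabash, Newville, Gym, 400, 0
Nebraska, Tingo,  Newfille, Gym, 400, 0
Ohio, Dinger, OldSchool, English, 10, 0
"""


def load(lines):
    """Build state -> county -> school -> class -> Attendance."""
    data = {}
    for lineno, line in enumerate(lines, 1):
        fields = [f.strip() for f in line.split(",")]
        if len(fields) != 6:
            continue  # skip blank or malformed lines
        state, county, school, cls, max_students, enrolled = fields
        entry = (data.setdefault(state, {})
                     .setdefault(county, {})
                     .setdefault(school, {})
                     .setdefault(cls, Attendance(max_students)))
        try:
            entry.accrue(enrolled)
        except TooManyAttendants:
            print("Too many attendants in line %i" % lineno)
    return data


data = load(io.StringIO(SAMPLE))
print(data["Nebraska"]["Wabash"]["Newville"]["Math"].max)
```

To read from disk instead, something like load(open("input.txt")) should work, since load() only needs an iterable of lines.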


You can then access things like

   a = data["Nebraska"]["Wabash"]["Newville"]["Math"]
   print a.max, a.total
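
Besides looking up a single class, you will probably want to walk the whole structure. A minimal sketch in Python 3 syntax, assuming the four-level nesting state/county/school/class: the walk() generator and the Record stand-in are hypothetical names used only to keep the example self-contained.

```python
from collections import namedtuple

# Stand-in for the attendance objects; only .max and .total matter here.
Record = namedtuple("Record", "max total")

data = {
    "Nebraska": {
        "Wabash": {"Newville": {"Math": Record(20, 0), "Gym": Record(400, 0)}},
        "Tingo": {"Newfille": {"Gym": Record(400, 0)}},
    },
    "Ohio": {"Dinger": {"OldSchool": {"English": Record(10, 0)}}},
}


def walk(data):
    """Yield (state, county, school, cls, record) for every class, sorted by key."""
    for state, counties in sorted(data.items()):
        for county, schools in sorted(counties.items()):
            for school, classes in sorted(schools.items()):
                for cls, record in sorted(classes.items()):
                    yield state, county, school, cls, record


rows = list(walk(data))
for state, county, school, cls, record in rows:
    print(state, county, school, cls, record.max, record.total)

# Indexing with a missing key raises KeyError; chained .get() calls
# return None instead when any level is absent.
missing = data.get("Ohio", {}).get("Nowhere", {}).get("OldSchool", {}).get("English")
```

The chained .get() form is handy when you are not sure a given state or county was present in the input.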

If capitalization varies, you may have to do something like

   data.setdefault(
    state.upper(), {}).setdefault(
    county.upper(), {}).setdefault(
    school.upper(), {}).setdefault(
    cls.upper(), Attendence(max_students)).accrue(enrolled)

to make sure they're normalized into the same groupings.
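
If the keys vary in spacing as well as case, it may be tidier to push the normalization into one small helper. The sketch below (Python 3 syntax) assumes that collapsing whitespace and upper-casing is enough; normalize() is a hypothetical name, and the counter values stand in for the attendance objects.

```python
def normalize(name):
    """Collapse runs of whitespace and fold to upper case."""
    return " ".join(name.split()).upper()


# Three spellings of the same state, two of the same county.
pairs = [("Nebraska", "Wabash"), ("NEBRASKA ", "wabash"), ("nebraska", "Tingo")]

counts = {}
for state, county in pairs:
    counties = counts.setdefault(normalize(state), {})
    key = normalize(county)
    counties[key] = counties.get(key, 0) + 1

print(counts)
```

Note that this only merges keys that differ in case or whitespace. Spellings that actually differ, such as Newville and Newfille in the sample data, still end up in separate groups.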

-tkc








