itertools.groupby usage to get structured data

Slafs slafs.e at gmail.com
Fri Feb 4 18:14:24 EST 2011


Hi there!

I'm having trouble wrapping my brain around this kind of problem:

What I have:
  1) a list of dicts
  2) a list of keys that I would like to group the elements of 1) by
  3) a list of keys whose values I would like to aggregate over the
elements of 1) with some function, e.g. sum

For instance, I have:
1) [ { 'g1' : 1, 'g2' : 8, 's_v1' : 5.0, 's_v2' : 3.5 },
      { 'g1' : 1, 'g2' : 9, 's_v1' : 2.0, 's_v2' : 3.0 },
      { 'g1' : 2, 'g2' : 8, 's_v1' : 6.0, 's_v2' : 8.0 }, ... ]
2) ['g1', 'g2']
3) ['s_v1', 's_v2']

To be precise, 1) is the result of calling the values method on a
QuerySet in Django (values returns dicts, whereas values_list would
return tuples); 2) is the arguments to that method; 3) is the
annotation keys. So 1) is the result of:
   qs.values('g1', 'g2').annotate(s_v1=Sum('v1'), s_v2=Sum('v2'))

What I want to have is:
a "big" nested dictionary with the 'g1' values as first-level keys, each
mapping to a dictionary of aggregates and "subgroups".

In my example it would be something like this:
{
  1 : {
          's_v1' : 7.0,
          's_v2' : 6.5,
          'g2' :{
                   8 : {
                          's_v1' : 5.0,
                          's_v2' : 3.5 },
                   9 :  {
                          's_v1' : 2.0,
                          's_v2' : 3.0 }
                }
       },
  2 : {
           's_v1' : 6.0,
           's_v2' : 8.0,
           'g2' : {
                    8 : {
                          's_v1' : 6.0,
                          's_v2' : 8.0}
           }
       },
...
}

# notice the summed values of s_v1 and s_v2 when g1 == 1
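
To make the target structure concrete, here is a minimal sketch that
builds exactly the dictionary above for the fixed two-level case. The
function name nest_two_levels is hypothetical, and the group and
aggregate keys are hard-coded on purpose, so this is an illustration of
the desired output rather than the general solution:

```python
rows = [
    {'g1': 1, 'g2': 8, 's_v1': 5.0, 's_v2': 3.5},
    {'g1': 1, 'g2': 9, 's_v1': 2.0, 's_v2': 3.0},
    {'g1': 2, 'g2': 8, 's_v1': 6.0, 's_v2': 8.0},
]


def nest_two_levels(rows):
    """Group rows by 'g1', then by 'g2', summing 's_v1' and 's_v2'."""
    result = {}
    for row in rows:
        # First level: one entry per 'g1' value, holding running totals
        # plus a 'g2' dict for the subgroups.
        outer = result.setdefault(
            row['g1'], {'s_v1': 0.0, 's_v2': 0.0, 'g2': {}})
        # Second level: one entry per 'g2' value inside that 'g1' entry.
        inner = outer['g2'].setdefault(
            row['g2'], {'s_v1': 0.0, 's_v2': 0.0})
        for key in ('s_v1', 's_v2'):
            outer[key] += row[key]
            inner[key] += row[key]
    return result


nested = nest_two_levels(rows)
```

With the three example rows, nested[1] carries the totals over both
'g2' subgroups, and nested[1]['g2'][8] holds the values of the single
matching row.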

I am looking for a solution that would let me do this kind of
grouping with variable lists for 2) and 3), e.g. also having 'g3' as a
grouping element, so that the 'g2' dicts could have their own
"subgroups" and be nested even further.
I tried something with itertools.groupby and updating nested
dicts, but as I was writing the code it started to feel too verbose to
me :/
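
For the variable-depth case, one possible shape is a small recursive
function built on itertools.groupby. This is only a sketch under some
assumptions: the rows are plain dicts, the function name nest and its
parameters are hypothetical, and the aggregate can be recomputed from
the row values at every level (true for sum, not for something like an
average of averages):

```python
from itertools import groupby
from operator import itemgetter

rows = [
    {'g1': 1, 'g2': 8, 's_v1': 5.0, 's_v2': 3.5},
    {'g1': 1, 'g2': 9, 's_v1': 2.0, 's_v2': 3.0},
    {'g1': 2, 'g2': 8, 's_v1': 6.0, 's_v2': 8.0},
]


def nest(rows, group_keys, agg_keys, agg=sum):
    """Aggregate agg_keys over rows, then recurse into subgroups."""
    # Aggregates for the current level, computed over all of its rows.
    node = {key: agg(row[key] for row in rows) for key in agg_keys}
    if group_keys:
        first, rest = group_keys[0], group_keys[1:]
        by_first = itemgetter(first)
        # groupby only merges adjacent items, so sort by the same key first.
        node[first] = {
            value: nest(list(group), rest, agg_keys, agg)
            for value, group in groupby(sorted(rows, key=by_first),
                                        key=by_first)
        }
    return node


# The top-level node also holds grand totals; the structure described
# above is what sits under its 'g1' key.
nested = nest(rows, ['g1', 'g2'], ['s_v1', 's_v2'])['g1']
```

Adding 'g3' to the list of group keys would nest one level deeper
without any change to the function.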

Do you have any hints, maybe? Because I'm kind of stuck :/

Regards

Sławek


