[Tutor] List processing question - consolidating duplicate entries

Tue Nov 27 23:40:41 CET 2007

bob gailer wrote:
> 2 - Sort the list. Create a new list with an entry for the first name, 
> project, workcode. Step thru the list. Each time the name, project, 
> workcode is the same, accumulate hours. When any of those change, create 
> a list entry for the next name, project, workcode and again start 
> accumulating hours.

This is a two-liner using itertools.groupby() and operator.itemgetter():

data = [['Bob', '07129', 'projectA', '4001', 5],
        ['Bob', '07129', 'projectA', '5001', 2],
        ['Bob', '07101', 'projectB', '4001', 1],
        ['Bob', '07140', 'projectC', '3001', 3],
        ['Bob', '07099', 'projectD', '3001', 2],
        ['Bob', '07129', 'projectA', '4001', 4],
        ['Bob', '07099', 'projectD', '4001', 3],
        ['Bob', '07129', 'projectA', '4001', 2]]

import itertools, operator
# sorted() puts rows with equal keys next to each other, which groupby() needs
for k, g in itertools.groupby(sorted(data), key=operator.itemgetter(0, 1, 2, 3)):
    print(k, sum(item[4] for item in g))
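
If you want the consolidated entries back as a list rather than printed, something along these lines should work. This is only a sketch in Python 3 syntax; the names key and consolidated are mine, and I am assuming the output rows should have the same shape as the input rows (four key fields followed by the summed hours):

```python
from itertools import groupby
from operator import itemgetter

data = [['Bob', '07129', 'projectA', '4001', 5],
        ['Bob', '07129', 'projectA', '5001', 2],
        ['Bob', '07101', 'projectB', '4001', 1],
        ['Bob', '07140', 'projectC', '3001', 3],
        ['Bob', '07099', 'projectD', '3001', 2],
        ['Bob', '07129', 'projectA', '4001', 4],
        ['Bob', '07099', 'projectD', '4001', 3],
        ['Bob', '07129', 'projectA', '4001', 2]]

# The first four fields identify an entry; the fifth is the hours.
key = itemgetter(0, 1, 2, 3)

# One output row per distinct key: the key fields plus the total hours.
consolidated = [
    list(k) + [sum(row[4] for row in group)]
    for k, group in groupby(sorted(data, key=key), key=key)
]

for row in consolidated:
    print(row)
```

The three ('Bob', '07129', 'projectA', '4001') rows collapse into a single entry with 11 hours; every other row has a unique key and keeps its own hours.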

For some explanation see my recent post:
http://mail.python.org/pipermail/tutor/2007-November/058753.html
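
Separately from that post, one thing worth keeping in mind is that groupby() only groups adjacent items with equal keys, which is why the data is sorted first. A small sketch with made-up rows (rows, key and the totals() helper are illustrative names, not from the data above):

```python
from itertools import groupby
from operator import itemgetter

# Made-up rows: name, project, hours. The two projectA rows are not adjacent.
rows = [['Bob', 'projectA', 5],
        ['Bob', 'projectB', 1],
        ['Bob', 'projectA', 4]]

key = itemgetter(0, 1)

def totals(items):
    """Sum the hours for each run of consecutive items sharing a key."""
    return [(k, sum(r[2] for r in g)) for k, g in groupby(items, key=key)]

unsorted_totals = totals(rows)        # projectA shows up twice
sorted_totals = totals(sorted(rows))  # projectA is merged into one group

print(unsorted_totals)
print(sorted_totals)
```

Without the sort, projectA is reported twice (5 and 4 hours); with it, the two rows are combined into one group of 9 hours.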

Kent