[Tutor] List processing question - consolidating duplicate entries

bob gailer bgailer at alum.rpi.edu
Tue Nov 27 23:12:30 CET 2007


Richard Querin wrote:
> I'm trying to process a list and I'm stuck. Hopefully someone can help
> me out here:
>
> I've got a list that is formatted as follows:
> [Name,job#,jobname,workcode,hours]
>
> An example might be:
>
> [Bob,07129,projectA,4001,5]
> [Bob,07129,projectA,5001,2]
> [Bob,07101,projectB,4001,1]
> [Bob,07140,projectC,3001,3]
> [Bob,07099,projectD,3001,2]
> [Bob,07129,projectA,4001,4]
> [Bob,07099,projectD,4001,3]
> [Bob,07129,projectA,4001,2]
>
> Now I'd like to consolidate entries that are duplicates. Duplicates
> meaning entries that share the same Name, job#, jobname and workcode.
> So for the list above, there are 3 entries for projectA which have a
> workcode of 4001. (There is a fourth entry for projectA but its
> workcode is 5001 and not 4001.)
>
> So I'd like to end up with a list so that the three duplicate entries
> are consolidated into one with their hours added up:
>
> [Bob,07129,projectA,4001,11]
> [Bob,07129,projectA,5001,2]
> [Bob,07101,projectB,4001,1]
> [Bob,07140,projectC,3001,3]
> [Bob,07099,projectD,3001,2]
> [Bob,07099,projectD,4001,3]

There are at least two more approaches.

1 - Use sqlite (or some other database): insert the data into the 
database, then run a SQL statement that selects sum(hours) with 
group by name, project, workcode.

2 - Sort the list. Create a new list with an entry for the first name, 
project, workcode. Step through the sorted list. Each time the name, 
project, workcode is the same as the previous entry, accumulate hours. 
When any of those change, create a list entry for the next name, 
project, workcode and start accumulating hours again.
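
As a rough illustration of approach 1, here is a minimal sketch using the 
standard library sqlite3 module with an in-memory database. The table name 
(timesheet), the column names, and the function name are assumptions made 
up for this example. The job numbers are kept as strings so the leading 
zeros survive, and the grouping uses all four key fields from the question.

```python
import sqlite3

# Sample data from the question; job numbers are strings to keep leading zeros.
entries = [
    ["Bob", "07129", "projectA", "4001", 5],
    ["Bob", "07129", "projectA", "5001", 2],
    ["Bob", "07101", "projectB", "4001", 1],
    ["Bob", "07140", "projectC", "3001", 3],
    ["Bob", "07099", "projectD", "3001", 2],
    ["Bob", "07129", "projectA", "4001", 4],
    ["Bob", "07099", "projectD", "4001", 3],
    ["Bob", "07129", "projectA", "4001", 2],
]


def consolidate_with_sqlite(entries):
    """Sum hours per (name, job, jobname, workcode) using an in-memory database."""
    conn = sqlite3.connect(":memory:")
    try:
        # Hypothetical schema, chosen only for this sketch.
        conn.execute(
            "CREATE TABLE timesheet "
            "(name TEXT, job TEXT, jobname TEXT, workcode TEXT, hours INTEGER)"
        )
        conn.executemany("INSERT INTO timesheet VALUES (?, ?, ?, ?, ?)", entries)
        cursor = conn.execute(
            "SELECT name, job, jobname, workcode, SUM(hours) "
            "FROM timesheet "
            "GROUP BY name, job, jobname, workcode "
            "ORDER BY name, job, jobname, workcode"
        )
        return [list(row) for row in cursor]
    finally:
        conn.close()


for entry in consolidate_with_sqlite(entries):
    print(entry)
```

Because of the ORDER BY clause, the consolidated rows come back sorted by 
key, not in the order they first appeared in the input.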

Approach 2 is IMHO the most straightforward and the easiest to code.
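
The sort-and-accumulate approach could be sketched roughly as follows. The 
function name is made up for this example, and it assumes each entry is a 
list of four strings followed by a number of hours, as in the question.

```python
# Sample data from the question; job numbers are strings to keep leading zeros.
entries = [
    ["Bob", "07129", "projectA", "4001", 5],
    ["Bob", "07129", "projectA", "5001", 2],
    ["Bob", "07101", "projectB", "4001", 1],
    ["Bob", "07140", "projectC", "3001", 3],
    ["Bob", "07099", "projectD", "3001", 2],
    ["Bob", "07129", "projectA", "4001", 4],
    ["Bob", "07099", "projectD", "4001", 3],
    ["Bob", "07129", "projectA", "4001", 2],
]


def consolidate_sorted(entries):
    """Sort the entries, then merge neighbours that share the same key fields."""
    result = []
    # sorted() returns a new list, so the caller's list is left untouched.
    for name, job, jobname, workcode, hours in sorted(entries):
        key = [name, job, jobname, workcode]
        if result and result[-1][:4] == key:
            # Same key as the previous entry: accumulate the hours.
            result[-1][4] += hours
        else:
            # Key changed (or first entry): start a new consolidated entry.
            result.append(key + [hours])
    return result


for entry in consolidate_sorted(entries):
    print(entry)
```

Sorting is what makes this work: it puts entries with the same key next to 
each other, so a single pass only ever needs to compare against the last 
consolidated entry. The result comes out in sorted order, not in the 
original input order.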

