[Python-Dev] tally (and other accumulators)

Alex Martelli aleaxit at gmail.com
Tue Apr 4 16:52:28 CEST 2006

It's a bit late for 2.5, of course, but I thought I'd propose it
anyway -- I noticed it on c.l.py (comp.lang.python).

In 2.3/2.4 we have many ways to generate and process iterators but
few "accumulators" -- functions that accept an iterable and produce
some kind of "summary result" from it: sum, min, and max, for example,
plus any and all in 2.5.
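
As a quick illustration of what "accumulator" means here, each of the
builtins named above reduces a whole iterable to one summary value.
The sample list is made up for the example:

```python
values = [3, 0, 7, 2]  # hypothetical sample data

total = sum(values)       # 12
smallest = min(values)    # 0
largest = max(values)     # 7
any_true = any(values)    # True: at least one item is truthy
all_true = all(values)    # False: the 0 is falsy
```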

The proposed function tally accepts an iterable whose items are  
hashable and returns a dict mapping each item to its count (number of  
times it appears).

This is quite general and simple at the same time: for example, it  
was proposed originally to answer some complaint about any and all  
giving no indication of the count of true/false items:

tally(bool(x) for x in seq)

would give a dict with up to two entries, the counts of true and
false items (a key is absent if no item has that truth value).
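
A small worked example of that call, as a sketch only: the tally
defined here is a minimal stand-in for the proposed function, and seq
is made-up sample data.

```python
import collections

def tally(iterable):
    # Minimal stand-in for the proposed function: count each hashable item.
    counts = collections.defaultdict(int)
    for item in iterable:
        counts[item] += 1
    return dict(counts)

seq = [0, 1, "", "a", None, [1]]  # hypothetical sample data
result = tally(bool(x) for x in seq)
print(result)  # {False: 3, True: 3}

# When every item is true, only the True key is present.
only_true = tally(bool(x) for x in [1, 2])
print(only_true)  # {True: 2}
```

Note that the items of seq need not be hashable themselves (the list
[1] is not); only the values actually passed to tally, here the bools,
must be.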

Just like the other accumulators mentioned above, tally is simple to  
implement, especially with the new collections.defaultdict:

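import collections

def tally(seq):
    # Map each hashable item of seq to the number of times it appears.
    d = collections.defaultdict(int)  # missing keys start at 0
    for item in seq:
        d[item] += 1
    return dict(d)  # plain dict: looking up an unseen item raises KeyError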

Nevertheless, simplicity and generality make it advisable to supply  
it as part of the standard library (location TBD).

A good alternative would be a classmethod tally within
collections.defaultdict, building and returning a defaultdict as
above (with .default_factory left as int, for further possible use as
a 'bag'/multiset data structure); this would solve the problem of
where to locate tally if it were to be a function.  defaultdict.tally
would be logically quite similar to dict.fromkeys, except that keys
that occur repeatedly get counted (and so each is associated with a
value of 1 or more) rather than "collapsed".
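
A rough sketch of how such a classmethod might behave.  Since a
classmethod cannot be attached to the builtin type from Python code,
this uses a hypothetical subclass, TallyDict, purely for
illustration; it is not an existing API.

```python
import collections

class TallyDict(collections.defaultdict):
    """Hypothetical subclass standing in for the proposed defaultdict.tally."""

    @classmethod
    def tally(cls, iterable):
        counts = cls(int)  # default_factory stays int on the result
        for item in iterable:
            counts[item] += 1
        return counts

bag = TallyDict.tally("abracadabra")
print(bag["a"], bag["b"], bag["c"])  # 5 2 1

# dict.fromkeys sees the same keys but collapses the repeats.
collapsed = dict.fromkeys("abracadabra")
print(sorted(collapsed) == sorted(bag))  # True

# Because default_factory is still int, the result keeps working as a bag.
bag["z"] += 1
print(bag["z"])  # 1
```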

