[Python-Dev] tally (and other accumulators)
aleaxit at gmail.com
Tue Apr 4 16:52:28 CEST 2006
It's a bit late for 2.5, of course, but, I thought I'd propose it
anyway -- I noticed it on c.l.py.
In 2.3/2.4 we have many ways to generate and process iterators but
few "accumulators" -- functions that accept an iterable and produce
some kind of "summary result" from it: sum, min, and max, for example,
plus any and all in 2.5.
The proposed function tally accepts an iterable whose items are
hashable and returns a dict mapping each item to its count (number of
times it appears).
This is quite general and simple at the same time: for example, it
was proposed originally to answer some complaint about any and all
giving no indication of the count of true/false items:
    tally(bool(x) for x in seq)
would give a dict with two entries, counts of true and false items.
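To make that concrete, here is a minimal sketch. tally is not an existing builtin, so a stand-in is defined with a plain dict, and the sample seq is made up for illustration:

```python
def tally(iterable):
    # Stand-in for the proposed function (hypothetical, not in the stdlib).
    counts = {}
    for item in iterable:
        counts[item] = counts.get(item, 0) + 1
    return counts

# Made-up sample data: two falsy items (0, "") and three truthy ones.
seq = [0, 1, 2, "", "x"]

counts = tally(bool(x) for x in seq)
print(counts[True], counts[False])
```

Unlike any and all, which stop at the first decisive item, this always consumes the whole iterable.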
Just like the other accumulators mentioned above, tally is simple to
implement, especially with the new collections.defaultdict:
    import collections
    d = collections.defaultdict(int)
    for item in seq:
        d[item] += 1
Nevertheless, simplicity and generality make it advisable to supply
it as part of the standard library (location TBD).
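As a sketch only, the loop above could be wrapped into a function along these lines. The name tally comes from the proposal; the parameter name and the choice to return a plain dict instead of the defaultdict are assumptions made for illustration:

```python
import collections

def tally(iterable):
    """Map each hashable item in iterable to the number of times it appears."""
    d = collections.defaultdict(int)
    for item in iterable:
        d[item] += 1
    # Assumption: hand back a plain dict so that lookups of missing keys
    # raise KeyError instead of silently inserting a zero count.
    return dict(d)

# Any iterable works, including a generator expression.
result = tally(w for w in "the cat and the hat and the bat".split())
print(result["the"], result["and"], result["cat"])
```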
A good alternative would be a classmethod tally within
collections.defaultdict, building and returning a defaultdict as
above (with .default_factory left as int, for further possible use as a
'bag'/multiset data structure); this would solve the problem of where
to locate tally if it were to be a function. defaultdict.tally would
be logically quite similar to dict.fromkeys, except that repeated keys
get counted (so each key is associated with a value of 1 or more)
rather than "collapsed".
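A classmethod cannot be attached to the builtin defaultdict from Python code, so the sketch below uses a hypothetical subclass, TallyDict, to illustrate the idea. The class name is an assumption and not part of the proposal:

```python
import collections

class TallyDict(collections.defaultdict):
    """Hypothetical stand-in for a defaultdict that has a tally classmethod."""

    @classmethod
    def tally(cls, iterable):
        # Build the instance with int as the default factory, so that
        # missing keys start at 0 and the result stays usable as a bag.
        d = cls(int)
        for item in iterable:
            d[item] += 1
        return d

bag = TallyDict.tally("abracadabra")

# dict.fromkeys keeps one entry per distinct key and discards the repeats.
collapsed = dict.fromkeys("abracadabra")

# The factory is still int, so the bag can keep accumulating new keys.
bag["z"] += 1
print(bag["a"], len(collapsed), bag["z"])
```

Because the default factory is kept, later code can go on incrementing counts for keys that were not in the original iterable.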