Counting things fast - was Re: Summing a 2D list

Gerhard Häring gh at ghaering.de
Thu Jun 12 11:14:34 EDT 2008


Aidan wrote:
> does this work for you?
> 
> users = [1,1,1,2,2,3,4,4,4]
> score = [0,1,5,3,1,2,3,3,2]
> 
> d = dict()
> 
> for u,s in zip(users,score):
>   if d.has_key(u):
>     d[u] += s
>   else:
>     d[u] = s
> 
> for key in d.keys():
>   print 'user: %d\nscore: %d\n' % (key,d[key])

I've recently had the very same problem and needed to optimize for the 
best solution. I've tried quite a few, including:

1) using a dictionary with a default value

d = collections.defaultdict(lambda: 0)
d[key] += value

2) Trying out if avoiding object allocation is worth the effort. Using 
Cython:

cdef class Counter:
     cdef int _counter
     def __init__(self):
         self._counter = 0

     def inc(self):
         self._counter += 1

     def __int__(self):
         return self._counter

     def __iadd__(self, operand):
         self._counter += 1
         return self

And no, this was *not* faster than the final solution. This counter 
class, which is basically a mutable int, is exactly as fast as just 
using this one (final solution) - tada!

counter = {}
try:
     counter[key] += 1
except KeyError:
     counter[key] = 1

Using psyco makes this a bit faster still. psyco can't optimize 
defaultdict or my custom Counter class, though.

-- Gerhard




More information about the Python-list mailing list