Counting things fast - was Re: Summing a 2D list
Gerhard Häring
gh at ghaering.de
Thu Jun 12 11:14:34 EDT 2008
Aidan wrote:
> does this work for you?
>
> users = [1,1,1,2,2,3,4,4,4]
> score = [0,1,5,3,1,2,3,3,2]
>
> d = dict()
>
> for u,s in zip(users,score):
> if d.has_key(u):
> d[u] += s
> else:
> d[u] = s
>
> for key in d.keys():
> print 'user: %d\nscore: %d\n' % (key,d[key])
I've recently had the very same problem and needed to optimize for the
best solution. I've tried quite a few, including:
1) using a dictionary with a default value
d = collections.defaultdict(lambda: 0)
d[key] += value
2) Trying out if avoiding object allocation is worth the effort. Using
Cython:
cdef class Counter:
cdef int _counter
def __init__(self):
self._counter = 0
def inc(self):
self._counter += 1
def __int__(self):
return self._counter
def __iadd__(self, operand):
self._counter += 1
return self
And no, this was *not* faster than the final solution. This counter
class, which is basically a mutable int, is exactly as fast as just
using this one (final solution) - tada!
counter = {}
try:
counter[key] += 1
except KeyError:
counter[key] = 1
Using psyco makes this a bit faster still. psyco can't optimize
defaultdict or my custom Counter class, though.
-- Gerhard
More information about the Python-list
mailing list