related lists mean value (golfed)

Peter Otten __peter__ at web.de
Tue Mar 9 17:10:26 CET 2010

```
Michael Rudolf wrote:

> On 09.03.2010 13:02, Peter Otten wrote:
>> >>> [sum(a for a,b in zip(x,y) if b==c)/y.count(c)for c in y]
>> [1.5, 1.5, 8.0, 4.0, 4.0, 4.0]
>> Peter
>
> ... pwned.
> Should be the fastest and shortest way to do it.

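It may be short, but it is not particularly efficient: for every element of
y it walks both lists again, so the work grows quadratically with the list
length. A dict-based approach that accumulates sums and counts in one pass
is probably the fastest. If y is guaranteed to be sorted, itertools.groupby()
may also be worth a try.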

$ cat tmp_average_compare.py
from __future__ import division
from collections import defaultdict
try:
    # Python 2: use the lazy izip; Python 3's zip is already lazy
    from itertools import izip as zip
except ImportError:
    pass

x = [1, 2, 8, 5, 0, 7]
y = ['a', 'a', 'b', 'c', 'c', 'c']

def f(x=x, y=y):
    # dict-based: one pass collects per-key sums (p) and counts (q)
    p = defaultdict(int)
    q = defaultdict(int)
    for a, b in zip(x, y):
        p[b] += a
        q[b] += 1
    return [p[b]/q[b] for b in y]

def g(x=x, y=y):
    # golfed one-liner: rescans x and y for every element of y
    return [sum(a for a,b in zip(x,y)if b==c)/y.count(c)for c in y]

if __name__ == "__main__":
    print(f())
    print(g())
    assert f() == g()
$ python3 -m timeit -s 'from tmp_average_compare import f, g' 'f()'
100000 loops, best of 3: 11.4 usec per loop
$ python3 -m timeit -s 'from tmp_average_compare import f, g' 'g()'
10000 loops, best of 3: 22.8 usec per loop

Peter

```
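
The itertools.groupby() idea mentioned in the post is not spelled out there, so here is a minimal sketch of how it might look. It assumes Python 3 and that equal keys in y are adjacent (for example because y is sorted). The function name h is hypothetical and not part of the original script, and no timings were taken for it.

```python
from itertools import groupby
from operator import itemgetter

x = [1, 2, 8, 5, 0, 7]
y = ['a', 'a', 'b', 'c', 'c', 'c']

def h(x=x, y=y):
    # Hypothetical helper, not from the original post.
    # Assumption: equal keys in y are adjacent (e.g. y is sorted);
    # groupby() only merges consecutive items with the same key.
    result = []
    for _key, pairs in groupby(zip(x, y), key=itemgetter(1)):
        values = [a for a, _b in pairs]
        mean = sum(values) / len(values)
        # repeat the group's mean once per member, as f() and g() do
        result.extend([mean] * len(values))
    return result

if __name__ == "__main__":
    print(h())
```

If y is not grouped, the same key can appear in several runs and each run would get its own mean, so the dict-based f() remains the safer choice for unsorted input.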