Efficient way to sum a product of numbers...
Jan Kaliszewski
zuo at chopin.edu.pl
Mon Aug 31 22:28:56 CEST 2009
On 31-08-2009 at 18:19:28, vsoler <vicente.soler at gmail.com> wrote:
> Say
> m= [[ 'a', 1], [ 'b', 2],[ 'a', 3]]
> r={'a':4, 'b':5, 'c':6}
>
> What I need is the calculation
>
> 1*4 + 2*5 + 3*4 = 4 + 10 + 12 = 26
>
> That is, for each row list in variable 'm' look for its first element
> in variable 'r' and multiply the value found by the second element in
> row 'm'. After that, sum all the products.
>
> What's an efficient way to do it? I have thousands of these
> calculations to make on a big data file.
On 31-08-2009 at 18:30:27, Tim Chase <python.list at tim.thechases.com> wrote:
> result = sum(v * r[k] for k,v in m)
You could also check whether this is more efficient:
from itertools import starmap
from operator import mul
result = sum(starmap(mul, ((r[name], hour) for name, hour in m)))
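As a quick sanity check, here is a minimal sketch that runs both expressions on the data from the original question. It assumes Python 3 syntax (print as a function); the variable names genexp_result and starmap_result are only illustrative.

```python
from itertools import starmap
from operator import mul

# Data from the original question.
m = [['a', 1], ['b', 2], ['a', 3]]
r = {'a': 4, 'b': 5, 'c': 6}

# Tim Chase's generator expression: look up each key in r, multiply, sum.
genexp_result = sum(v * r[k] for k, v in m)

# The starmap variant: build (rate, hours) pairs, then multiply each pair.
starmap_result = sum(starmap(mul, ((r[name], hour) for name, hour in m)))

print(genexp_result, starmap_result)  # 26 26
```

Both give 1*4 + 2*5 + 3*4 = 26, so the choice between them is only a matter of speed and readability.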
Or, if you had m in the form of two lists:
names = ['a', 'b', 'a']
hours = [1, 2, 3]
...then you could do:
from itertools import imap as map # <- remove if you use Py3.x
from operator import mul
result = sum(map(mul, map(r.__getitem__, names), hours))
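If the data arrives as rows rather than as two lists, one possible way to get the two sequences is zip(*m). The sketch below assumes Python 3, where the built-in map is already lazy, so no imap import is needed. It builds two tuples in memory, which may or may not be acceptable for a big data file.

```python
from operator import mul

# Data from the original question.
m = [['a', 1], ['b', 2], ['a', 3]]
r = {'a': 4, 'b': 5, 'c': 6}

# Transpose the rows into two parallel tuples.
names, hours = zip(*m)  # ('a', 'b', 'a') and (1, 2, 3)

# map(r.__getitem__, names) looks up each name's value lazily;
# the outer map multiplies it with the matching entry of hours.
result = sum(map(mul, map(r.__getitem__, names), hours))

print(result)  # 26
```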
Cheers,
*j
PS. I've done a quick test on my computer (Pentium 4, 2.4 GHz, Linux):
>>> import timeit
>>> setup = (
...     "from itertools import starmap, imap; from operator import mul; "
...     "import random, string; "
...     "names = [random.choice(string.ascii_letters) for x in xrange(10000)]; "
...     "hours = [random.randint(1, 12) for x in xrange(1000)]; "
...     "m = zip(names, hours); workers = set(names); "
...     "r = dict(zip(workers, (random.randint(1, 10) for x in xrange(len(workers)))))"
... )
>>> tests = (
... 'sum(v * r[k] for k,v in m)',
... 'sum(starmap(mul, ((r[name], hour) for name, hour in m)))',
... 'sum(imap(mul, imap(r.__getitem__, names), hours))',
... )
>>> for t in tests:
...     print t
...     timeit.repeat(t, setup, number=1000)
...     print
...
sum(v * r[k] for k,v in m)
[6.2493009567260742, 6.1892399787902832, 6.2634339332580566]
sum(starmap(mul, ((r[name], hour) for name, hour in m)))
[9.3293819427490234, 10.280816078186035, 9.2766909599304199]
sum(imap(mul, imap(r.__getitem__, names), hours))
[5.7341709136962891, 5.5898380279541016, 5.7318859100341797]
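The session above is Python 2 (xrange, imap, the print statement). For anyone wanting to repeat the comparison on Python 3, here is a rough sketch of an equivalent benchmark. The helper names build_data and time_variants, the fixed seed and the default sizes are my own assumptions, not part of the original test, and the timings will of course differ from the numbers above.

```python
import random
import string
import timeit
from itertools import starmap
from operator import mul


def build_data(n_items, seed=0):
    """Build random test data shaped like the original setup string."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatable runs
    names = [rng.choice(string.ascii_letters) for _ in range(n_items)]
    hours = [rng.randint(1, 12) for _ in range(n_items)]
    m = list(zip(names, hours))
    r = {name: rng.randint(1, 10) for name in set(names)}
    return names, hours, m, r


# The three expressions from the post; imap is replaced by Python 3's lazy map.
STATEMENTS = (
    'sum(v * r[k] for k, v in m)',
    'sum(starmap(mul, ((r[name], hour) for name, hour in m)))',
    'sum(map(mul, map(r.__getitem__, names), hours))',
)


def time_variants(n_items=10000, number=100, repeat=3):
    """Return {statement: list of timings in seconds} for each variant."""
    names, hours, m, r = build_data(n_items)
    namespace = {'names': names, 'hours': hours, 'm': m, 'r': r,
                 'starmap': starmap, 'mul': mul}
    return {
        stmt: timeit.repeat(stmt, repeat=repeat, number=number,
                            globals=namespace)
        for stmt in STATEMENTS
    }
```

Calling time_variants() and printing each statement with its list of timings gives output in the same shape as the session above. Pass number=1000 to match the repetition count used there.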
--
Jan Kaliszewski (zuo) <zuo at chopin.edu.pl>