[Numpy-discussion] Sum, multiply are slow ?

Travis Oliphant oliphant.travis at ieee.org
Thu Jul 12 02:47:05 EDT 2007

David Cournapeau wrote:
> Hi,
>     While profiling some code, I noticed that sum in numpy is kind of 
> slow once you use axis argument:
Yes, this is expected, because when using an axis argument the 
following two things can happen:

1) You may be skipping over large chunks of memory to get to the next 
available number, and out-of-cache memory access is slow.

2) You have to allocate a result array.
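
To make the two points above concrete, here is a small sketch. It assumes a C-ordered float64 array like the one randn returns, and it uses a smaller, hypothetical shape than the benchmark in this thread. It only shows the memory layout and the result shapes; it does not measure anything.

```python
import numpy as np

# Hypothetical, smaller stand-in for the array in the quoted benchmark.
a = np.random.randn(1000, 30)

# For a C-ordered float64 array, stepping along axis 1 moves 8 bytes,
# while stepping along axis 0 moves a whole row: 30 * 8 = 240 bytes.
print(a.strides)

# Reducing along an axis has to allocate a new result array ...
row_sums = a.sum(axis=1)   # shape (1000,)
col_sums = a.sum(axis=0)   # shape (30,)

# ... whereas the full reduction only returns a scalar.
total = a.sum()
```

How much either effect costs in practice depends on the array shape, the cache sizes and the NumPy version, so the timings quoted below should not be expected to reproduce exactly.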

> import numpy as N
> a = N.random.randn(1e5, 30)
> %timeit N.sum(a) #-> 26.8ms
> %timeit N.sum(a, 1) #-> 65.5ms
> %timeit N.sum(a, 0) #-> 141ms
> Now, if I use some tricks, I get:
> %timeit N.sum(a) #-> 26.8 ms
> %timeit N.dot(a, N.ones(a.shape[1], a.dtype)) #-> 11.3ms
> %timeit N.dot(N.ones((1, a.shape[0]), a.dtype), a) #-> 15.5ms
> I realize that dot uses optimized libraries (atlas in my case) and all, 
> but is there any way to improve this situation ?
Sum does *not* use an optimized library so it is not too surprising that 
you can get speed-ups using ATLAS.  It would be nice to do something to 
optimize the reduction functions in NumPy, but nobody has come forward 
with suggestions yet.
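
For reference, the dot-based trick from the quoted message can be checked against sum as follows. This is only a sketch on a smaller, hypothetical array; whether dot is actually faster depends on the BLAS library NumPy is linked against and on the NumPy version, so no timings are claimed here.

```python
import numpy as np

# Hypothetical, smaller stand-in for the array in the quoted benchmark.
a = np.random.randn(1000, 30)

# Sum over axis 1: multiply by a vector of ones on the right.
row_sums_dot = np.dot(a, np.ones(a.shape[1], a.dtype))

# Sum over axis 0: multiply by a row vector of ones on the left.
# The result has shape (1, 30), so take row 0 to compare with sum.
col_sums_dot = np.dot(np.ones((1, a.shape[0]), a.dtype), a)

# Both agree with sum up to floating-point rounding.
print(np.allclose(row_sums_dot, a.sum(axis=1)))
print(np.allclose(col_sums_dot[0], a.sum(axis=0)))
```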

Thanks for the reports, though.

