[Numpy-discussion] Optimized sum of squares

Skipper Seabold jsseabold at gmail.com
Sun Oct 18 12:06:49 EDT 2009


On Sun, Oct 18, 2009 at 8:09 AM, Gael Varoquaux
<gael.varoquaux at normalesup.org> wrote:
> On Sun, Oct 18, 2009 at 09:06:15PM +1100, Gary Ruben wrote:
>> Hi Gaël,
>
>> If you've got a 1D array/vector called "a", I think the normal idiom is
>
>> np.dot(a,a)
>
>> For the more general case, I think
>> np.tensordot(a, a, axes=something_else)
>> should do it, where you should be able to figure out something_else for
>> your particular case.
>
> Ha, yes. Good point about the tensordot trick.
>
> Thank you
>
> Gaël
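The idioms mentioned above can be sketched as follows; the 2D array `b` and the `axes=2` choice are my illustrative assumptions, not from the original mails:

```python
import numpy as np

# 1D case: dot(a, a) computes the sum of squares directly
a = np.arange(5.0)
ss_dot = np.dot(a, a)

# general case: tensordot contracts over the chosen axes; with a 2D
# array, axes=2 contracts both axes, giving the total sum of squares
b = np.arange(12.0).reshape(3, 4)
ss_all = np.tensordot(b, b, axes=2)

assert ss_dot == np.sum(a * a)
assert ss_all == np.sum(b * b)
```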

I'm curious about this, as I use ss, which is just np.sum(a*a, axis),
in statsmodels and hadn't thought much about it.

There is

import numpy as np
from scipy.stats import ss

a = np.ones(5000)

but

timeit ss(a)
10000 loops, best of 3: 21.5 µs per loop

timeit np.add.reduce(a*a)
100000 loops, best of 3: 15 µs per loop

timeit np.dot(a,a)
100000 loops, best of 3: 5.38 µs per loop

Does the number of loops matter in the timings, and is dot always
faster, even without a BLAS dot?
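The `%timeit` runs above use IPython; a self-contained sketch with the stdlib `timeit` module is below. Absolute timings will differ by machine and by whether NumPy is linked against an optimized BLAS, so the printed numbers are not meant to reproduce the ones quoted:

```python
import numpy as np
from timeit import timeit

a = np.ones(5000)

# three equivalent ways to compute the sum of squares of a 1D array
for stmt in ["np.sum(a*a)", "np.add.reduce(a*a)", "np.dot(a, a)"]:
    t = timeit(stmt, globals={"np": np, "a": a}, number=10000)
    print(f"{stmt:22s} {t:.4f} s for 10000 calls")

# all three agree numerically
assert np.allclose(np.sum(a * a), np.dot(a, a))
assert np.allclose(np.add.reduce(a * a), np.dot(a, a))
```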

Skipper


