One more thing to mention on this topic.
From a certain size, the dot product becomes faster than sum (due to parallelisation, I guess?).
E.g.:

    def dotsum(arr):
        a = arr.reshape(1000, 100)
        return a.dot(np.ones(100)).sum()

    a = np.ones(100000)
    In [45]: %timeit np.add.reduce(a, axis=None)
    42.8 µs ± 2.44 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

    In [43]: %timeit dotsum(a)
    26.1 µs ± 718 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
But theoretically, sum should be faster than the dot product by a fair bit.
Isn’t parallelisation implemented for it?
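For anyone who wants to try this locally, here is a minimal self-contained version of the benchmark above, using the stdlib `timeit` module instead of the IPython magic (the iteration count is illustrative; absolute timings will depend on which BLAS your NumPy links against):

```python
import timeit

import numpy as np

def dotsum(arr):
    # Reshape to 2-D and reduce via a BLAS matrix-vector product,
    # then sum the 1000 partial results.
    a = arr.reshape(1000, 100)
    return a.dot(np.ones(100)).sum()

a = np.ones(100000)

# Sanity check: both reductions compute the same value.
assert np.isclose(dotsum(a), np.add.reduce(a, axis=None))

# Time each approach over the same number of iterations.
t_sum = timeit.timeit(lambda: np.add.reduce(a, axis=None), number=1000)
t_dot = timeit.timeit(lambda: dotsum(a), number=1000)
print(f"np.add.reduce: {t_sum * 1e6 / 1000:.1f} us/loop")
print(f"dotsum:        {t_dot * 1e6 / 1000:.1f} us/loop")
```

Note that `dotsum` as written only works for arrays whose size is exactly 100000 (it hard-codes the reshape to 1000×100), so it trades generality for the BLAS-backed reduction.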
I cannot reproduce that:

    In [3]: %timeit np.add.reduce(a, axis=None)
    19.7 µs ± 184 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

    In [4]: %timeit dotsum(a)
    47.2 µs ± 360 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

But almost certainly it is indeed due to optimizations, since .dot uses BLAS, which is highly optimized (at least on some platforms; clearly better on yours than on mine!). I thought .sum() was optimized too, but perhaps less so? It may be good to raise a quick issue about this!

Thanks, Marten