On Fri, Feb 16, 2024 at 12:40 AM Marten van Kerkwijk <mhvk@astro.utoronto.ca> wrote:
> From my experience, calling methods is generally faster than
> functions. I figure it is due to having less overhead figuring out the
> input. Maybe it is not significant for large data, but it does make a
> difference even when working with medium-sized arrays - say, a float
> array of size 5000.
>
> %timeit a.sum()
> 3.17 µs
> %timeit np.sum(a)
> 5.18 µs

It is more that np.sum checks if there is a .sum() method and if so
calls that.  And then `ndarray.sum()` calls `np.add.reduce(array)`.

Also note that np.sum does a bunch of work *in pure Python*. Some of that Python code is really bad too: it goes through `_wrapreduction`, which has weird semantics (trying `getattr(x, 'sum')` for any object). We could/should remove that behavior, and it currently makes the function even slower.

The large gap in performance has little to do with functions vs. methods. It comes from the method being implemented in C and not having to defer to the function, rather than the other way around.
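
To make the dispatch order concrete, here is a rough sketch of the kind of Python-level work described above. `sum_like_np_sum` is a hypothetical name, and this is a simplified stand-in rather than the actual NumPy source: it assumes the wrapper only looks for a `.sum` attribute on non-ndarray inputs and otherwise falls through to `np.add.reduce`.

```python
import numpy as np


def sum_like_np_sum(obj, axis=None):
    """Hypothetical, simplified stand-in for the Python-level dispatch in np.sum."""
    # Assumption: only objects that are not exactly ndarray get the
    # attribute lookup; if they expose a .sum method, defer to it.
    if type(obj) is not np.ndarray:
        method = getattr(obj, "sum", None)
        if method is not None:
            return method(axis=axis)
    # Plain ndarrays (and objects without .sum, such as lists) end up
    # in the ufunc reduction, which is implemented in C.
    return np.add.reduce(obj, axis=axis)


a = np.arange(5000.)
print(sum_like_np_sum(a) == a.sum())
```

Every step before the final `np.add.reduce` call runs in the interpreter, which is the overhead that `a.sum()` avoids.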

Cheers,
Ralf

 
In [2]: a = np.arange(5000.)

In [3]: %timeit np.sum(a)
3.89 µs ± 411 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

In [4]: %timeit a.sum()
2.43 µs ± 42 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

In [5]: %timeit np.add.reduce(a)
2.33 µs ± 31 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

Though I must admit I'm a bit surprised the excess is *that* large for
using np.sum...  There may be a little micro-optimization to be found...
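
For anyone who wants to repeat the comparison outside IPython, a minimal sketch using the standard-library `timeit` module follows. The helper name `per_call_us` and the loop counts are arbitrary choices, and it reports the best of several repeats rather than the mean that `%timeit` prints, so the numbers will not match the ones above exactly and will vary by machine and NumPy version.

```python
import timeit

import numpy as np

a = np.arange(5000.)


def per_call_us(stmt, number=10_000, repeat=5):
    """Best-of-`repeat` time per call, in microseconds (hypothetical helper)."""
    best = min(
        timeit.repeat(stmt, globals={"np": np, "a": a}, number=number, repeat=repeat)
    )
    return best / number * 1e6


# The same three spellings that are timed above.
timings = {
    stmt: per_call_us(stmt)
    for stmt in ("np.sum(a)", "a.sum()", "np.add.reduce(a)")
}
for stmt, us in timings.items():
    print(f"{stmt:18s} {us:6.2f} µs per call")
```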

-- Marten