On Fri, Feb 16, 2024 at 12:40 AM Marten van Kerkwijk <mhvk@astro.utoronto.ca> wrote:
From my experience, calling methods is generally faster than functions. I figure it is due to having less overhead figuring out the input. Maybe it is not significant for large data, but it does make a difference even when working with medium-sized arrays, say a float array of size 5000.
%timeit a.sum()
3.17 µs
%timeit np.sum(a)
5.18 µs
It is more that np.sum checks if there is a .sum() method and if so calls that. And then `ndarray.sum()` calls `np.add.reduce(array)`.
Also note that np.sum does a bunch of work *in pure Python*. Some of that Python code is really bad too, using `_wrapreduction` which has weird semantics (trying `getattr(x, 'sum')` for any object) that we could/should remove and that currently make the function even slower. The large gap in performance has little to do with functions vs. methods, more like the method being implemented in C and not having to defer to the function, rather than the other way around.

Cheers,
Ralf
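For illustration, the dispatch pattern described above might look roughly like the sketch below. This is not NumPy's actual source: `sum_like` and `HasSum` are hypothetical names, the keyword handling is reduced to `axis` only, and the choice to skip the method lookup for exact `ndarray` instances is an assumption of the sketch.

```python
import numpy as np


def sum_like(obj, axis=None):
    """Hypothetical, simplified stand-in for a Python-level reduction wrapper."""
    # Assumption of this sketch: only non-ndarray inputs get the method lookup.
    if type(obj) is not np.ndarray:
        # Look for a `sum` method on whatever was passed in and defer to it.
        method = getattr(obj, "sum", None)
        if method is not None:
            return method(axis=axis)
    # Otherwise hand over to the ufunc reduction, which is implemented in C.
    return np.add.reduce(np.asarray(obj), axis=axis)


class HasSum:
    """Toy object with its own `sum` method (illustration only)."""

    def sum(self, axis=None):
        return "custom sum"


a = np.arange(5000.)
result_array = sum_like(a)           # falls through to np.add.reduce
result_custom = sum_like(HasSum())   # defers to HasSum.sum
result_list = sum_like([1, 2, 3])    # lists have no .sum, so np.add.reduce
```

Every step before the final `np.add.reduce` call runs in the Python interpreter, which is the kind of per-call overhead that matters for an array of only 5000 elements.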
In [2]: a = np.arange(5000.)
In [3]: %timeit np.sum(a)
3.89 µs ± 411 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

In [4]: %timeit a.sum()
2.43 µs ± 42 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

In [5]: %timeit np.add.reduce(a)
2.33 µs ± 31 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
Though I must admit I'm a bit surprised the excess is *that* large for using np.sum... There may be a little micro-optimization to be found...
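For anyone who wants to repeat the comparison outside IPython, a minimal sketch using the standard-library `timeit` module could look like this. The names `candidates`, `n_calls`, and `timings_us` are made up for the example, and the `lambda` wrappers add a small constant overhead to every measurement, so only the differences between the three numbers are meaningful. Absolute values will vary by machine and NumPy version.

```python
import timeit

import numpy as np

a = np.arange(5000.)

# The three spellings compared in the session above.
candidates = {
    "np.sum(a)": lambda: np.sum(a),
    "a.sum()": lambda: a.sum(),
    "np.add.reduce(a)": lambda: np.add.reduce(a),
}

n_calls = 2_000
timings_us = {}
for label, func in candidates.items():
    # Take the best of 5 repeats and convert to microseconds per call.
    best = min(timeit.repeat(func, number=n_calls, repeat=5))
    timings_us[label] = best / n_calls * 1e6

for label, t in timings_us.items():
    print(f"{label:<18} {t:.2f} µs per call")
```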
-- Marten