[Numpy-discussion] Fastest way to compute summary statistics for a specific axis

Oscar Benjamin oscar.j.benjamin at gmail.com
Mon Mar 16 12:04:30 EDT 2015

On 16 March 2015 at 15:53, Dave Hirschfeld <dave.hirschfeld at gmail.com> wrote:
> I have a number of large arrays for which I want to compute the mean and
> standard deviation over a particular axis - e.g. I want to compute the
> statistics for axis=1 as if the other axes were combined so that in the
> example below I get two values back
> In [1]: a = randn(30, 2, 10000)
> Both methods are however significantly slower than the initial attempt:
> In [9]: %timeit a.mean(0).mean(-1)
> 1000 loops, best of 3: 1.2 ms per loop
> Perhaps because it allocates a smaller temporary?
> For those who like a challenge: is there a faster way to achieve what
> I'm after?

You'll probably find it faster if you swap the two means around, so that
the first reduction runs over the long last axis and leaves an even
smaller temporary:
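
A minimal sketch of what swapping the means could look like, assuming the
array shape from the quoted example (30, 2, 10000). The variable names and
the seeded generator are only for illustration, and no timings are claimed
here; measure with %timeit on your own data.

```python
import numpy as np

# Illustrative data with the shape from the quoted example.
rng = np.random.default_rng(0)
a = rng.standard_normal((30, 2, 10000))

# Original order: the intermediate a.mean(0) has shape (2, 10000).
m_orig = a.mean(0).mean(-1)

# Swapped order: the intermediate a.mean(-1) has shape (30, 2).
m_swapped = a.mean(-1).mean(0)

# Reference: reduce over both axes in one call (tuple axis, NumPy >= 1.7).
m_ref = a.mean(axis=(0, 2))

# Chaining is only valid for the mean, and only because every group has
# the same number of elements. For the standard deviation, one option is
# to chain the means of a and a*a and combine them afterwards. Note that
# a * a allocates a full-size temporary, so this is shown for correctness
# rather than speed.
mean_sq = (a * a).mean(-1).mean(0)
s_chained = np.sqrt(mean_sq - m_swapped ** 2)
s_ref = a.std(axis=(0, 2))

print(m_swapped.shape)  # (2,)
```

Both orders give the same two values up to rounding, so the choice between
them only affects the size of the intermediate array. The std cannot be
obtained by chaining a.std(...) calls in the same way.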


