# [Numpy-discussion] Fastest way to compute summary statistics for a specific axis

Dave Hirschfeld dave.hirschfeld at gmail.com
Mon Mar 16 11:53:08 EDT 2015

```I have a number of large arrays for which I want to compute the mean and
standard deviation over a particular axis - e.g. I want to compute the
statistics for axis=1 as if the other axes were combined so that in the
example below I get two values back

In : a = randn(30, 2, 10000)

For the mean this can be done easily like:

In : a.mean(0).mean(-1)
Out: array([ 0.0007, -0.0009])

...but this won't work for the std. Using some transformations we can
come up with something which will work for either:

In : a.transpose(2,0,1).reshape(-1, 2).mean(axis=0)
Out: array([ 0.0007, -0.0009])

In : a.transpose(1,0,2).reshape(2, -1).mean(axis=-1)
Out: array([ 0.0007, -0.0009])

If we look at the performance of these equivalent methods:

In : %timeit a.transpose(2,0,1).reshape(-1, 2).mean(axis=0)
100 loops, best of 3: 14.5 ms per loop

In : %timeit a.transpose(1,0,2).reshape(2, -1).mean(axis=-1)
100 loops, best of 3: 5.05 ms per loop

we can see that the latter version is a clear winner. Investigating
further, both methods appear to copy the data so the performance is
likely down to better cache utilisation.

In : np.may_share_memory(a, a.transpose(2,0,1).reshape(-1, 2))
Out: False

In : np.may_share_memory(a, a.transpose(1,0,2).reshape(2, -1))
Out: False

Both methods are however significantly slower than the initial attempt:

In : %timeit a.mean(0).mean(-1)
1000 loops, best of 3: 1.2 ms per loop

Perhaps because it allocates a smaller temporary?

For those who like a challenge: is there a faster way to achieve what
I'm after?

Cheers,
Dave

```