[Numpy-discussion] Fastest way to compute summary statistics for a specific axis

Eric Moore ewm at redtetrahedron.org
Mon Mar 16 12:10:21 EDT 2015


On Mon, Mar 16, 2015 at 11:53 AM, Dave Hirschfeld <dave.hirschfeld at gmail.com> wrote:

> I have a number of large arrays for which I want to compute the mean and
> standard deviation over a particular axis - e.g. I want to compute the
> statistics for axis=1 as if the other axes were combined so that in the
> example below I get two values back
>
> In [1]: a = randn(30, 2, 10000)
>
> For the mean this can be done easily like:
>
> In [2]: a.mean(0).mean(-1)
> Out[2]: array([ 0.0007, -0.0009])
>
>
> ...but this won't work for the std. Using some transformations we can
> come up with something which will work for either:
>
> In [3]: a.transpose(2,0,1).reshape(-1, 2).mean(axis=0)
> Out[3]: array([ 0.0007, -0.0009])
>
> In [4]: a.transpose(1,0,2).reshape(2, -1).mean(axis=-1)
> Out[4]: array([ 0.0007, -0.0009])
>
>
Pass all of the axes you want to reduce over as a tuple to the axis
argument. This works for std as well as for mean:

In [1]: import numpy as np

In [2]: a = np.random.randn(30, 2, 10000)

In [3]: a.mean(axis=(0,-1))
Out[3]: array([-0.00224589, 0.00230759])

In [4]: a.std(axis=(0,-1))
Out[4]: array([ 1.00062771, 1.0001258 ])
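
As a sanity check, here is a minimal sketch (not part of the session above) that compares the tuple-axis reduction with the transpose/reshape approach from the original question. The seeded generator and the variable names are assumptions made for illustration. A tuple-valued axis needs NumPy 1.7 or later, and default_rng needs 1.17 or later.

```python
import numpy as np

# Seed chosen arbitrarily so the check is reproducible (an assumption,
# not something from the original thread).
rng = np.random.default_rng(0)
a = rng.standard_normal((30, 2, 10000))

# Reduce over axes 0 and 2 in a single call, keeping axis 1.
mean_tuple = a.mean(axis=(0, -1))
std_tuple = a.std(axis=(0, -1))

# Reference: bring axis 1 to the front, flatten the rest, reduce once.
flat = a.transpose(1, 0, 2).reshape(2, -1)
mean_flat = flat.mean(axis=-1)
std_flat = flat.std(axis=-1)

# Chaining single-axis calls is fine for the mean here, but for std it
# computes the spread of the per-column stds, a different quantity.
chained_std = a.std(axis=0).std(axis=-1)

print(np.allclose(mean_tuple, mean_flat))
print(np.allclose(std_tuple, std_flat))
print(np.allclose(std_tuple, chained_std))
```

The tuple form should agree with the reshape form for both statistics, while the chained std does not, which is why a.std(0).std(-1) is not a substitute. The tuple form also avoids the copy that reshape has to make after a transpose.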



-Eric