[Numpy-discussion] Does np.std() make two passes through the data?

josef.pktd at gmail.com josef.pktd at gmail.com
Sun Nov 21 19:18:13 EST 2010


On Sun, Nov 21, 2010 at 6:43 PM, Keith Goodman <kwgoodman at gmail.com> wrote:
> Does np.std() make two passes through the data?
>
> Numpy:
>
>>> arr = np.random.rand(10)
>>> arr.std()
>   0.3008736260967052
>
> Looks like an algorithm that makes one pass through the data (one for
> loop) wouldn't match arr.std():
>
>>> np.sqrt((arr*arr).mean() - arr.mean()**2)
>   0.30087362609670526
>
> But a slower two-pass algorithm would match arr.std():
>
>>> np.sqrt(((arr - arr.mean())**2).mean())
>   0.3008736260967052
>
> Is there a way to get the same result as arr.std() in one pass (cython
> for loop) of the data?

A reference that has been pointed to several times on this list is the Wikipedia page, e.g.
http://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#On-line_algorithm

I don't know about the actual implementation.
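
For illustration, the on-line (Welford-style) update described on that page
could be sketched roughly as below. This is only a sketch of the textbook
recurrence, not what np.std() does internally; the function name online_std
is made up for this example, and it computes the population standard
deviation (the same default as arr.std(), ddof=0).

```python
import numpy as np


def online_std(values):
    """One-pass population std (ddof=0) via Welford's running update.

    Hypothetical helper for illustration; not part of NumPy.
    """
    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for x in values:
        n += 1
        delta = x - mean          # deviation from the old mean
        mean += delta / n         # update the running mean
        m2 += delta * (x - mean)  # uses both old and new mean
    if n == 0:
        return float("nan")
    return float(np.sqrt(m2 / n))


arr = np.random.rand(10)
print(online_std(arr), arr.std())
```

The two printed values should agree to within floating-point rounding, but
they are not guaranteed to match bit for bit, since the order of operations
differs from the two-pass formula.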

Josef



