[Numpy-discussion] Does np.std() make two passes through the data?
josef.pktd at gmail.com
Sun Nov 21 19:18:13 EST 2010
On Sun, Nov 21, 2010 at 6:43 PM, Keith Goodman <kwgoodman at gmail.com> wrote:
> Does np.std() make two passes through the data?
>
> Numpy:
>
> >>> arr = np.random.rand(10)
> >>> arr.std()
> 0.3008736260967052
>
> Looks like an algorithm that makes one pass through the data (one for
> loop) wouldn't match arr.std():
>
> >>> np.sqrt((arr*arr).mean() - arr.mean()**2)
> 0.30087362609670526
>
> But a slower two-pass algorithm would match arr.std():
>
> >>> np.sqrt(((arr - arr.mean())**2).mean())
> 0.3008736260967052
>
> Is there a way to get the same result as arr.std() in one pass (cython
> for loop) of the data?
A reference that has been pointed to several times on the list is the
Wikipedia page on algorithms for calculating variance, e.g.
http://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#On-line_algorithm
I don't know about the actual implementation in numpy.
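For illustration, here is a rough pure-Python sketch of the on-line
(Welford-style) update described on that page. The function name
online_std is made up for this example, it computes the population
standard deviation (ddof=0, the default of arr.std()), and it is not
meant to reflect how np.std() is implemented. A Cython version would
presumably use the same loop with typed variables.

```python
import math

def online_std(values):
    """One-pass population standard deviation (ddof=0), Welford-style update.

    Hypothetical helper for illustration; not numpy's implementation.
    """
    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for x in values:
        n += 1
        delta = x - mean          # deviation from the old mean
        mean += delta / n         # update the running mean
        m2 += delta * (x - mean)  # uses both the old and the new mean
    if n == 0:
        return float("nan")
    return math.sqrt(m2 / n)

if __name__ == "__main__":
    print(online_std([1.0, 2.0, 3.0, 4.0]))
```

The result should agree with arr.std() to within floating-point
rounding, but it is not guaranteed to match to the last digit, so a
comparison with np.allclose is safer than testing for exact equality.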
Josef