Thanks Julian, those seem like Nice improvements. The fact that it either does or doesnt work depending on the axis makes me a Little queesy; but yeah, checking that fp's do what You think they should, is unfortunately best left as the responsibility of the programmer.

From: Julian Taylor
Sent: ‎24-‎7-‎2014 14:56
To: Discussion of Numerical Python
Subject: Re: [Numpy-discussion] numpy.mean still broken for large float32arrays

On Thu, Jul 24, 2014 at 1:33 PM, Fabien <fabien.maussion@gmail.com> wrote:
> Hi all,
>
> On 24.07.2014 11:59, Eelco Hoogendoorn wrote:
>> np.mean isn't broken; your understanding of floating point number is.
>
> I am quite new to python, and this problem is discussed over and over
> for other languages too. However, numpy's summation problem appears with
> relatively small arrays already:
>
> py>import numpy as np
> py>np.ones((4000,4000), np.float32).mean()
> 1.0
> py>np.ones((5000,5000), np.float32).mean()
> 0.67108864000000001
>
> A 5000*5000 image is not unusual anymore today.
>
> In IDL:
> IDL> mean(fltarr(5000L, 5000L)+1)
>         1.0000000
> IDL> mean(fltarr(7000L, 7000L)+1)
>         1.0000000
> IDL> mean(fltarr(10000L, 10000L)+1)
>        0.67108864
>
> I can't really explain why there are differences between the two
> languages (IDL uses 32-bit, single-precision, floating-point numbers)
>
> Fabien
>

something as simple as summation is already an interesting algorithmic
problem there are several ways do to with different speeds and
accuracies. E.g. pythons math.fsum is always exact to one ulp but is
very slow as it requires partial sorting. Then there is kahan
summation that has an accuracy of O(1) ulp but its about 4 times
slower than the naive sum.
In practice one of the better methods is pairwise summation that is
pretty much as fast as a naive summation but has an accuracy of
O(logN) ulp.
This is the method numpy 1.9 will use this method by default (+ its
even a bit faster than our old implementation of the naive sum):
https://github.com/numpy/numpy/pull/3685

but it has some limitations, it is limited to blocks fo the buffer
size (8192 elements by default) and does not work along the slow axes
due to limitations in the numpy iterator.
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion