Thank you Bruce and all,

I knew I was doing something wrong (I should have read the mean method docs more closely). I'm of course glad it's so easily understandable.

But: if the error can get this big, wouldn't it be better for the accumulator to always be of type 'float64', converting back to the original array's type at the end?

As one can see in this case, the result would be much closer to the true value.
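For what it's worth, np.mean already accepts a dtype argument that makes it accumulate in double precision without converting the array itself. A minimal sketch, using synthetic float32 data in the same range as the thread's file (the actual data.npy is not reproduced here):

```python
import numpy as np

# Hypothetical stand-in for the thread's data: one million float32 values
# near 3045, like the file discussed above.
rng = np.random.default_rng(0)
data = (3045.0 + rng.random((1000, 1000))).astype(np.float32)

# Default: accumulation happens in float32 and can drift badly.
naive = data.mean()

# dtype=np.float64 tells np.mean to accumulate in double precision,
# while the array itself stays float32.
accurate = data.mean(dtype=np.float64)

print(naive, accurate)
```

With the float64 accumulator the result lands between data.min() and data.max(), as a mean must.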

Michael

On 2012-01-24 19:01:40 +0000, Val Kalatsky said:

Just what Bruce said.

You can run the following to confirm:

np.mean(data - data.mean())

If for some reason you do not want to convert to float64, you can add the result of the previous line to the "bad" mean:

bad_mean = data.mean()

good_mean = bad_mean + np.mean(data - bad_mean)

Val

On Tue, Jan 24, 2012 at 12:33 PM, K.-Michael Aye <kmichael.aye@gmail.com> wrote:

I know I know, that's pretty outrageous to even suggest, but please

bear with me, I am stumped as you may be:

2-D data file here:

http://dl.dropbox.com/u/139035/data.npy

Then:

In [3]: data.mean()

Out[3]: 3067.0243839999998

In [4]: data.max()

Out[4]: 3052.4343

In [5]: data.shape

Out[5]: (1000, 1000)

In [6]: data.min()

Out[6]: 3040.498

In [7]: data.dtype

Out[7]: dtype('float32')

A mean value calculated per loop over the data gives me 3045.747251076416

I first thought I still misunderstood how data.mean() works, per axis and so on, but I did the same with a flattened version, with the same results.

Am I really so tired that I can't see what I am doing wrong here?

For completeness, the data was read by an osgeo.gdal dataset method called ReadAsArray().

My numpy.__version__ gives me 1.6.1 and my whole setup is based on

Enthought's EPD.

Best regards,

Michael

_______________________________________________

NumPy-Discussion mailing list

NumPy-Discussion@scipy.org

http://mail.scipy.org/mailman/listinfo/numpy-discussion