I have confirmed this on a 64-bit Linux machine running Python 2.7.2 with the development version of NumPy. It seems to be related to using float32 instead of float64, presumably because the running sum is accumulated in float32 and loses precision as it grows. If the array is first converted to a 64-bit float (via astype), mean gives an answer that agrees with your looped-calculation value: 3045.747251076416. With the original 32-bit array, averaging successively on one axis and then on the other gives answers that agree with the 64-bit float answer to the second decimal place.

In [125]: d = np.load('data.npy')

In [126]: d.mean()
Out[126]: 3067.0243839999998

In [127]: d64 = d.astype('float64')

In [128]: d64.mean()
Out[128]: 3045.747251076416

In [129]: d.mean(axis=0).mean()
Out[129]: 3045.7487500000002

In [130]: d.mean(axis=1).mean()
Out[130]: 3045.7444999999998

In [131]: np.version.full_version
Out[131]: '2.0.0.dev-55472ca'
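
To illustrate what I think is going on, here is a minimal sketch that does not need the original file. It assumes synthetic data (uniform random values in the same range as yours, not your actual array) and uses np.cumsum with dtype=float32 as a stand-in for an element-by-element float32 running sum; the helper names are made up for this example. Newer NumPy releases may sum differently internally, so d.mean() itself may not reproduce the number above, but passing dtype=np.float64 to mean asks for a 64-bit accumulator without making a converted copy of the array.

```python
import numpy as np


def sequential_float32_mean(a):
    """Mean computed from a running sum that is kept in float32 (hypothetical helper)."""
    flat = np.asarray(a, dtype=np.float32).ravel()
    # cumsum adds one element at a time, so its last entry is a naively
    # accumulated float32 total.
    total = np.cumsum(flat, dtype=np.float32)[-1]
    return float(total) / flat.size


def float64_mean(a):
    """Mean of the same data, accumulated in float64 (hypothetical helper)."""
    return float(np.mean(a, dtype=np.float64))


# Synthetic stand-in for data.npy: same shape, dtype and value range (assumption).
rng = np.random.default_rng(0)
data = rng.uniform(3040.498, 3052.4343, size=(1000, 1000)).astype(np.float32)

print("float32 running sum:", sequential_float32_mean(data))
print("float64 accumulator:", float64_mean(data))
print("max of the data:    ", float(data.max()))
```

Once the float32 running sum gets large, the spacing between representable values exceeds the rounding room needed for each ~3045 increment, so every addition is rounded to a nearby multiple of that spacing and the error builds up. That would also explain why the per-axis means above are much closer: each partial sum stays small.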

-- On Tue, 2012-01-24 at 12:33 -0600, K.-MichaelA wrote:

I know I know, that's pretty outrageous to even suggest, but please bear with me, I am stumped as you may be:

2-D data file here: http://dl.dropbox.com/u/139035/data.npy

Then:

In [3]: data.mean()
Out[3]: 3067.0243839999998

In [4]: data.max()
Out[4]: 3052.4343

In [5]: data.shape
Out[5]: (1000, 1000)

In [6]: data.min()
Out[6]: 3040.498

In [7]: data.dtype
Out[7]: dtype('float32')

A mean value calculated in a loop over the data gives me 3045.747251076416. I first thought I still misunderstood how data.mean() works (per axis and so on), but I did the same with a flattened version and got the same result.

Am I really so tired that I can't see what I am doing wrong here? For completeness, the data was read by an osgeo.gdal dataset method called ReadAsArray(). My numpy.__version__ gives me 1.6.1, and my whole setup is based on Enthought's EPD.

Best regards, Michael

NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion