On Jan 24, 2012, at 1:33 PM, K.-Michael Aye wrote:

I know I know, that's pretty outrageous to even suggest, but please bear with me, I am stumped as you may be:

2-D data file here: http://dl.dropbox.com/u/139035/data.npy

Then:

In [3]: data.mean()
Out[3]: 3067.0243839999998

In [4]: data.max()
Out[4]: 3052.4343

In [5]: data.shape
Out[5]: (1000, 1000)

In [6]: data.min()
Out[6]: 3040.498

In [7]: data.dtype
Out[7]: dtype('float32')

A mean value calculated with a loop over the data gives me 3045.747251076416. I first thought I was still misunderstanding how data.mean() works, per axis and so on, but I did the same with a flattened version and got the same result.
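
For reference, a loop-based check of this kind could look something like the sketch below. The original loop was not posted, so its shape is an assumption; loop_mean is a hypothetical helper, and the small array is a stand-in for the real data. Python floats are double precision, so the running total here is kept in 64 bits.

```python
import numpy as np

def loop_mean(arr):
    """Mean of all elements, accumulated in a Python float (double precision)."""
    total = 0.0
    count = 0
    for value in arr.ravel():   # ravel() flattens the array, so axes play no role
        total += float(value)
        count += 1
    return total / count

# Stand-in data (not the real file): float32 values inside the reported range.
demo = np.linspace(3040.498, 3052.4343, 10000, dtype=np.float32).reshape(100, 100)
print(loop_mean(demo), demo.min(), demo.max())
```

A mean computed this way has to land between the minimum and the maximum, which is what makes the data.mean() result above so surprising.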

Am I really so tired that I can't see what I am doing wrong here? For completeness, the data was read by an osgeo.gdal dataset method called ReadAsArray(). My numpy.__version__ gives me 1.6.1, and my whole setup is based on Enthought's EPD.

I get the same result:

In [1]: import numpy

In [2]: data = numpy.load('data.npy')

In [3]: data.mean()
Out[3]: 3067.0243839999998

In [4]: data.max()
Out[4]: 3052.4343

In [5]: data.min()
Out[5]: 3040.498

In [6]: numpy.version.version
Out[6]: '2.0.0.dev-433b02a'

This is on OS X 10.7.2 with Python 2.7.1, on an Intel Core i7. Running Python as a 32-bit vs. 64-bit process doesn't make a difference.

The data matrix doesn't look too strange when I view it as an image -- all pretty smooth variation around the (min, max) range. But maybe it's still somehow floating-point pathological?

This is fun too:

In [12]: data.mean()
Out[12]: 3067.0243839999998

In [13]: (data/3000).mean()*3000
Out[13]: 3020.8074375000001

In [15]: (data/2).mean()*2
Out[15]: 3067.0243839999998

In [16]: (data/200).mean()*200
Out[16]: 3013.6754000000001
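
One way to probe the floating-point hypothesis, offered as a sketch rather than a diagnosis: float32 has a 24-bit significand, and the sum of 10^6 values near 3045 is about 3e9, where adjacent float32 values are 256 apart, so a running float32 total cannot absorb each addend exactly. The code below uses synthetic data in the reported range (not the real file), passes dtype=numpy.float64 to mean() to request a double-precision accumulator, and uses a float32 cumsum as a stand-in for sequential single-precision accumulation. Whether mean() itself accumulates that way depends on the NumPy version, so treat that part as an assumption.

```python
import numpy

# Synthetic stand-in for data.npy: float32, same shape and value range as reported.
rng = numpy.random.RandomState(0)
data = rng.uniform(3040.498, 3052.4343, size=(1000, 1000)).astype(numpy.float32)

# Ask mean() to accumulate in double precision instead of the array's own dtype.
mean64 = data.mean(dtype=numpy.float64)

# Running total kept in float32; used here as a proxy for naive
# single-precision accumulation (an assumption, not a claim about mean()).
naive32 = numpy.cumsum(data.ravel(), dtype=numpy.float32)[-1] / data.size

print("float64 accumulator:", mean64)
print("float32 running sum:", naive32)
print("min, max:", data.min(), data.max())
```

If the double-precision mean of the real file falls inside the (min, max) range while the default one does not, that would point at accumulator precision rather than at the data.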

Zach