On Jan 24, 2012, at 1:33 PM, K.-Michael Aye wrote:
I know, I know, that's pretty outrageous to even suggest, but please bear with me; I am stumped, as you may be:
2-D data file here: http://dl.dropbox.com/u/139035/data.npy
Then:

In [3]: data.mean()
Out[3]: 3067.0243839999998

In [4]: data.max()
Out[4]: 3052.4343

In [5]: data.shape
Out[5]: (1000, 1000)

In [6]: data.min()
Out[6]: 3040.498

In [7]: data.dtype
Out[7]: dtype('float32')
A mean value calculated by looping over the data gives me 3045.747251076416. I first thought I still misunderstood how data.mean() works, per axis and so on, but I did the same with a flattened version, with the same result.
Am I really so tired that I can't see what I'm doing wrong here? For completeness: the data was read by an osgeo.gdal dataset method called ReadAsArray(). My numpy.__version__ gives me 1.6.1, and my whole setup is based on Enthought's EPD.
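The numbers are consistent with float32 accumulation error. At the time, .mean() on a float32 array summed in float32, so the running total climbs to about 3e9, where neighboring float32 values are 256 apart; every ~3000-sized addend then gets rounded by up to 128, and because all the values sit in a narrow band those roundings go the same direction each time instead of cancelling. Below is a minimal sketch with made-up data in the same range (not the posted file) that imitates the naive float32 accumulation and shows the usual float64-accumulator workaround:

import numpy as np

# Hypothetical stand-in for the posted file: a million float32 values
# in roughly the same range (about 3040 to 3053).
rng = np.random.default_rng(0)
data = (3040 + 13 * rng.random((1000, 1000))).astype(np.float32)

# Naive left-to-right accumulation in float32, as data.mean() did on a
# float32 array back then: near the end of the loop the accumulator is
# around 3e9, where the float32 spacing is 256, so each addend of about
# 3000 is rounded and the error builds up systematically.
acc = np.float32(0.0)
for x in data.flat:
    acc = acc + x
print("naive float32 mean :", acc / np.float32(data.size))

# Accumulating in float64 avoids the problem.
print("float64 accumulator:", data.mean(dtype=np.float64))

Passing dtype=np.float64 to .mean() (or converting with data.astype(np.float64) first) is the standard workaround; later NumPy releases also switched to pairwise summation, which keeps the default float32 result much closer to the true mean.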
I get the same result:

In [1]: import numpy
In [2]: data = numpy.load('data.npy')
In [3]: data.mean()
Out[3]: 3067.0243839999998
In [4]: data.max()
Out[4]: 3052.4343
In [5]: data.min()
Out[5]: 3040.498
In [6]: numpy.version.version
Out[6]: '2.0.0.dev-433b02a'

This is on OS X 10.7.2 with Python 2.7.1, on an Intel Core i7. Running Python as a 32- vs. 64-bit process doesn't make a difference.

The data matrix doesn't look too strange when I view it as an image -- all pretty smooth variation around the (min, max) range. But maybe it's still somehow floating-point pathological?

This is fun too:

In [12]: data.mean()
Out[12]: 3067.0243839999998
In [13]: (data/3000).mean()*3000
Out[13]: 3020.8074375000001
In [15]: (data/2).mean()*2
Out[15]: 3067.0243839999998
In [16]: (data/200).mean()*200
Out[16]: 3013.6754000000001

Zach
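The scaling experiment fits the same picture: dividing a binary float by 2 only changes the exponent, so every element and every partial sum is scaled exactly and the identical rounding pattern (and identical wrong answer) comes back, whereas dividing by 200 or 3000 re-rounds every element and the accumulated error lands differently. A small sketch on synthetic float32 data (again, not the posted file); the power-of-two identity holds regardless of how NumPy orders the summation internally:

import numpy as np

rng = np.random.default_rng(0)
data = (3040 + 13 * rng.random((1000, 1000))).astype(np.float32)

# Dividing by a power of two is exact: every element and every partial
# sum is halved without rounding, so the scaled-back mean matches
# data.mean() bit for bit, including whatever error it carries.
print((data / 2).mean() * 2 == data.mean())      # True

# Dividing by 200 rounds the mantissa of every element, so the rounding
# during summation plays out differently and the scaled-back mean
# generally does not match data.mean().
print((data / 200).mean() * 200 == data.mean())  # almost always False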