bug ! arr.mean() outside arr.min() .. arr.max() range

Hi! b is a non-native byteorder array of type int16, but see further down: the result is the same after converting to native byte order.
repr(b.dtype)
'dtype('>i2')'
b.dtype.isnative
False
b.shape
(38, 512, 512)
b.max()
1279
b.min()
0
b.mean()
-65.279878014
U.mmms(b)  # my "useful" function for min,max,mean,stddev
(0, 1279, 365.878016723, 123.112379036)
c = b.copy()
c.max()
1279
c.min()
0
c.mean()
-65.279878014
d = N.asarray(b, b.dtype.newbyteorder('='))
d.dtype.isnative
True
d.max()
1279
d.min()
0
d.mean()
-65.279878014
N.__version__
'1.0b2.dev2996'
Sorry that I don't have a simple example. What could be wrong? - Sebastian Haase

Sebastian Haase wrote:
Hi! b is a non-native byteorder array of type int16 but see further down: same after converting to native ...
repr(b.dtype)
'dtype('>i2')'
The problem is no doubt related to "wrapping" for integers: your total is getting too large to fit into the reducing data type. What does d.sum() give you? You can use d.mean(dtype='d') to force the reduction to be done in doubles. -Travis
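The wrapping explanation can be checked against the numbers posted above. The sketch below first does the arithmetic: if a signed 32-bit running total overflowed exactly once, the sum would be off by 2**32 and the mean by 2**32 / n. It then reproduces the effect on stand-in data. The stand-in array (every element set to 366) and the explicit dtype=np.int32 accumulator are assumptions made for illustration; the original array b is not available, and recent NumPy releases already compute the mean of integer arrays in float64 by default.

```python
import numpy as np

# Figures reported in the thread.
n = 38 * 512 * 512                   # number of elements in b
reported_true_mean = 365.878016723   # from U.mmms(b)
reported_bad_mean = -65.279878014    # from b.mean()

# One overflow of a signed 32-bit total shifts the sum by 2**32,
# which shifts the mean by 2**32 / n.
predicted_bad_mean = reported_true_mean - 2**32 / n
print(predicted_bad_mean)

# Stand-in data (an assumption, not the original array).
a = np.full((38, 512, 512), 366, dtype=">i2")

wrapped_total = a.sum(dtype=np.int32)  # force a 32-bit accumulator: wraps silently
exact_total = a.sum(dtype=np.int64)    # wide enough to hold the true total
double_mean = a.mean(dtype="d")        # the workaround suggested in the thread

print(int(wrapped_total), int(exact_total), float(double_mean))
```

The predicted value agrees with the reported b.mean() to the printed precision, which is consistent with a 32-bit accumulator having wrapped once.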

Travis Oliphant wrote:
The problem is no doubt related to "wrapping" for integers. Your total is getting too large to fit into the reducing data-type. What does d.sum() give you?
I can't check that particular array until Monday...
You can add d.mean(dtype='d') to force reduction over doubles.
This almost sounds as if what I reported is a feature!? Is there a sensible, generic way to avoid such "accidents"? Maybe reducing int8, uint8, int16, and uint16 into doubles should be the default!? - Sebastian
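One generic way to avoid the accident from user code is a small wrapper that always accumulates integer and boolean input in doubles. This is only a sketch: safe_mean is a hypothetical name, not a NumPy function, and it simply passes dtype to the existing mean method.

```python
import numpy as np

def safe_mean(arr, axis=None):
    """Hypothetical helper: mean that never accumulates in an integer type."""
    arr = np.asarray(arr)
    if arr.dtype.kind in "iub":  # signed int, unsigned int, bool
        return arr.mean(axis=axis, dtype=np.float64)
    # Floating-point and complex input keep NumPy's default behaviour.
    return arr.mean(axis=axis)

small = np.array([0, 1279], dtype=">i2")
print(safe_mean(small))
```

Byte order does not matter here, since the reduction runs in native float64 either way.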

On 8/11/06, Sebastian Haase <haase@msg.ucsf.edu> wrote:
This almost sounds as if what I reported is a feature!? Is there a sensible, generic way to avoid such "accidents"? Maybe reducing int8, uint8, int16, and uint16 into doubles should be the default!?
Hard to say. I always bear the precision in mind when accumulating numbers, but even so it is possible to get unexpected results. Even doubles can give problems if there are a few large numbers mixed with many small numbers. That said, folks probably expect means to be accurate and don't want modular arithmetic, so doubles would probably be a better default. It would be slower, though. I think this problem was discussed previously in regard to the reduce methods.
Chuck
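The point about doubles can be seen with one large value followed by many small ones. In the sketch below (the values are made up for illustration), each 1.0 is lost when added to 1e16 in a plain running total, because adjacent doubles near 1e16 are 2 apart. math.fsum from the standard library tracks the lost low-order bits and returns the correctly rounded sum.

```python
import math

# One large number followed by many small ones (illustrative values).
values = [1e16] + [1.0] * 1000

# Plain left-to-right accumulation in double precision.
naive = 0.0
for v in values:
    naive += v

# Correctly rounded summation from the standard library.
exact = math.fsum(values)

print(naive - 1e16)   # contribution of the small values that survived
print(exact - 1e16)
```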
Participants (3): Charles R Harris, Sebastian Haase, Travis Oliphant