[Numpy-discussion] large float32 array issue

Wed Nov 3 14:38:35 EDT 2010

Vincent Schut :
> Hi, I'm running in this strange issue when using some pretty large 
> float32 arrays. In the following code I create a large array filled with 
> ones, and calculate mean and sum, first with a float64 version, then 
> with a float32 version. Note the difference between the two. NB the 
> float64 version is obviously right :-)
>
>
>
> In [2]: areaGrid = numpy.ones((11334, 16002))
> In [3]: print(areaGrid.dtype)
> float64
> In [4]: print(areaGrid.shape, areaGrid.min(), areaGrid.max(), 
> areaGrid.mean(), areaGrid.sum())
> ((11334, 16002), 1.0, 1.0, 1.0, 181366668.0)
>
>
> In [5]: areaGrid = numpy.ones((11334, 16002), numpy.float32)
> In [6]: print(areaGrid.dtype)
> float32
> In [7]: print(areaGrid.shape, areaGrid.min(), areaGrid.max(), 
> areaGrid.mean(), areaGrid.sum())
> ((11334, 16002), 1.0, 1.0, 0.092504406598019437, 16777216.0)
>
>   
Yes, I see the same problem:
 >>> import numpy as npy
 >>> b = npy.ones((11334, 16002), dtype='float32')
 >>> b.shape[0]*b.shape[1]
181366668L
 >>> b.sum()
16777216.0
 >>> print(npy.finfo(b.dtype).max)
3.40282e+38
The range of the accumulator is definitely not the problem.
I think floating-point precision is what kicks in here.
Try the following code:
npy.float32(16777216)+npy.float32(1)
You will see that the number does not grow any more.
That is because the spacing between adjacent float32 values at
16777216 (2**24) is 2: eps(npy.float32(16777216)) = 2 > 1.
That is why you cannot accumulate by adding 1, or any smaller number,
beyond this value.
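
The spacing claim can be checked directly. This is only a minimal sketch,
assuming a current NumPy imported as `np` (the session above uses the alias
`npy`). `np.spacing` gives the distance from a value to the next
representable float of the same type, which is what `eps(...)` stands for
above:

```python
import numpy as np

limit = np.float32(16777216)      # 2**24
gap = np.spacing(limit)           # distance to the next float32 above 2**24
stuck = limit + np.float32(1)     # exact result 16777217 is not a float32

print(float(gap))       # 2.0
print(stuck == limit)   # True

# Repeating the addition does not help: the total never leaves 2**24.
total = limit
for _ in range(20):
    total = total + np.float32(1)
print(total == limit)   # True
```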
try:
npy.float32(16777215)+npy.float32(0.5)
and:
npy.float64(1e16)+npy.float64(1)
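In both cases the value being added is smaller than the spacing between
adjacent floats at that magnitude, so the exact sum cannot be represented
and the small term is not accumulated correctly.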
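
For reference, a small sketch of what these two additions produce, again
assuming a current NumPy imported as `np`. The results are converted to
Python floats only so that the printed form does not depend on the NumPy
version:

```python
import numpy as np

a = np.float32(16777215) + np.float32(0.5)   # exact sum would be 16777215.5
b = np.float64(1e16) + np.float64(1)         # exact sum would be 1e16 + 1

print(float(a))   # 16777216.0, rounded to a neighbouring float32
print(float(b))   # 1e+16, the 1 is lost

# Spacing of each type at the magnitude in question.
print(float(np.spacing(np.float32(16777215))))   # 1.0
print(float(np.spacing(np.float64(1e16))))       # 2.0
```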
numpy.sum() is simply clumsy in this respect. It just accumulates all
the values in one running total, which should be avoided for
floating-point values, even in float64. Think about adding a very large
number of tiny values to 1e12, each one smaller than half the float64
spacing at 1e12 (roughly 6e-5): every single addition rounds back to
1e12, so even if the tiny values together amount to another 1e12, you
get 1.0e12 instead of 2e12. Some people do smarter things, like:
1) put all the small values into one group and all the big values into another
2) sum each group separately
3) add the two sums together
But that is costly, I guess.
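
The three steps above could be sketched roughly as follows. This is an
illustration under assumptions, not a NumPy feature: `grouped_sum` and its
`threshold` argument are hypothetical names, and the threshold value is
arbitrary. `np.cumsum` is used only to get a strict left-to-right float32
running total for comparison. The last line shows a cheaper alternative that
NumPy does provide, the `dtype` argument of `sum`, which asks for a float64
accumulator while the array itself stays float32:

```python
import numpy as np

def grouped_sum(values, threshold):
    """Sum small and large magnitudes separately, then combine (hypothetical helper)."""
    magnitude = np.abs(values)
    small = values[magnitude < threshold]     # step 1: split into two groups
    large = values[magnitude >= threshold]
    return small.sum() + large.sum()          # steps 2 and 3

# One value at 2**24 followed by a thousand ones, all stored as float32.
data = np.ones(1001, dtype=np.float32)
data[0] = 16777216.0

sequential = np.cumsum(data)[-1]        # running float32 total, left to right
grouped = grouped_sum(data, 1000.0)     # the grouping idea from the list above
wide = data.sum(dtype=np.float64)       # float64 accumulator, float32 storage

print(float(sequential))   # 16777216.0, every 1 is lost
print(float(grouped))      # 16778216.0
print(float(wide))         # 16778216.0
```

For an array of ones like the one in the original question, passing
`dtype=numpy.float64` to `sum()` or `mean()` is probably the simplest way to
keep float32 storage and still get the expected total.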
> Can anybody confirm this? And better: explain it? Am I running into a 
> for me till now hidden ieee float 'feature'? Or is it a bug somewhere?
>
> Btw I'd like to use float32 arrays, as precision is not really an issue 
> in this case, but memory usage is...
>
>
> This is using python 2.7, numpy from git (yesterday's checkout), on arch 
> linux 64bit.
>
> Best,
> Vincent.