[Numpy-discussion] large float32 array issue

Wed Nov 3 07:31:31 EDT 2010

On Wed, Nov 3, 2010 at 5:59 AM, Warren Weckesser <
warren.weckesser at enthought.com> wrote:

>
>
> On Wed, Nov 3, 2010 at 3:54 AM, Vincent Schut <schut at sarvision.nl> wrote:
>
>> Hi, I'm running into a strange issue when using some pretty large
>> float32 arrays. In the following code I create a large array filled with
>> ones, and calculate mean and sum, first with a float64 version, then
>> with a float32 version. Note the difference between the two. NB the
>> float64 version is obviously right :-)
>>
>>
>>
>> In [2]: areaGrid = numpy.ones((11334, 16002))
>> In [3]: print(areaGrid.dtype)
>> float64
>> In [4]: print(areaGrid.shape, areaGrid.min(), areaGrid.max(),
>> areaGrid.mean(), areaGrid.sum())
>> ((11334, 16002), 1.0, 1.0, 1.0, 181366668.0)
>>
>>
>> In [5]: areaGrid = numpy.ones((11334, 16002), numpy.float32)
>> In [6]: print(areaGrid.dtype)
>> float32
>> In [7]: print(areaGrid.shape, areaGrid.min(), areaGrid.max(),
>> areaGrid.mean(), areaGrid.sum())
>> ((11334, 16002), 1.0, 1.0, 0.092504406598019437, 16777216.0)
>>
>>
>> Can anybody confirm this? And, better, explain it? Am I running into an
>> IEEE float 'feature' that was hidden from me until now? Or is it a bug
>> somewhere?
>>
>> Btw I'd like to use float32 arrays, as precision is not really an issue
>> in this case, but memory usage is...
>>
>>
>> This is using Python 2.7, NumPy from git (yesterday's checkout), on Arch
>> Linux 64-bit.
>>
>>
>
> The problem kicks in with an array of ones of size 2**24.  float32 has a
> 24-bit significand, so from 2**24 upward adjacent values are 2.0 apart, and
> np.float32(2**24) + np.float32(1.0) rounds back to np.float32(2**24):
>
>
> In [41]: b = np.ones(2**24, np.float32)
>
> In [42]: b.size, b.sum()
> Out[42]: (16777216, 16777216.0)
>
> In [43]: b = np.ones(2**24+1, np.float32)
>
> In [44]: b.size, b.sum()
> Out[44]: (16777217, 16777216.0)
>
> In [45]: np.spacing(np.float32(2**24))
> Out[45]: 2.0
>
> In [46]: np.float32(2**24) + np.float32(1)
> Out[46]: 16777216.0
>
>
>
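
The stall can also be reproduced without allocating a large array. The
following is only a minimal sketch: sequential_sum is a hypothetical helper
written for this illustration (it is not a NumPy function), and it adds the
values one at a time into an accumulator of the chosen dtype. Newer NumPy
releases may add the elements of sum() in a different order, so b.sum()
itself might not show the stall there; the explicit loop forces it.

```python
import numpy as np

def sequential_sum(values, dtype):
    """Add values one at a time into an accumulator of the given dtype.

    Hypothetical helper for illustration only; not part of NumPy.
    """
    acc = dtype(0)
    for v in values:
        # Round back to `dtype` after every single addition.
        acc = dtype(acc + dtype(v))
    return acc

# Start just below 2**24, then add 1.0 six times.
values = [2**24 - 3] + [1.0] * 6

sum32 = sequential_sum(values, np.float32)  # stalls at 2**24
sum64 = sequential_sum(values, np.float64)  # keeps counting
gap = np.spacing(np.float32(2**24))         # distance to the next float32

print(sum32, sum64, gap)
```

The float32 accumulator stops at 16777216.0 because each further + 1.0 lands
halfway between two representable values and rounds back down, while the
float64 accumulator reaches 16777219.0.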

By the way, you can override the dtype of the accumulator of the mean()
function:

In [61]: a = np.ones((11334,16002),np.float32)

In [62]: a.mean()  # Not correct
Out[62]: 0.092504406598019437

In [63]: a.mean(dtype=np.float64)
Out[63]: 1.0
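
The same dtype keyword is accepted by sum(), so the array can stay float32
in memory while only the accumulator is float64. A small sketch, assuming a
1000 x 1000 array instead of the original shape just to keep it quick to run:

```python
import numpy as np

# The data stays float32: 4 bytes per element instead of 8.
a = np.ones((1000, 1000), np.float32)

# Only the accumulator is widened; no float64 copy of `a` is requested.
total = a.sum(dtype=np.float64)
average = a.mean(dtype=np.float64)

print(a.dtype, a.nbytes, total, average)
```

The results come back as float64 scalars, and a itself is left unchanged.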

Warren