[Numpy-discussion] could anyone check on a 32bit system?

josef.pktd at gmail.com josef.pktd at gmail.com
Wed May 1 15:50:15 EDT 2013


On Wed, May 1, 2013 at 1:24 PM, Matthew Brett <matthew.brett at gmail.com> wrote:
> HI,
>
> On Wed, May 1, 2013 at 9:09 AM, Yaroslav Halchenko <lists at onerussian.com> wrote:
>>
>> On Wed, 01 May 2013, Nathaniel Smith wrote:
>>> > not sure there is anything to fix here. Third-party code relying on a
>>> > certain outcome of rounding error is likely incorrect anyway.
>>
>>> Yeah, seems to just be the standard floating point indeterminism.
>>> Using Matthew's numbers and pure Python floats:
>>
>>> In [9]: (0.49505185 + 0.53529587) + -0.13461665
>>> Out[9]: 0.89573107
>>
>>> In [10]: 0.49505185 + (0.53529587 + -0.13461665)
>>> Out[10]: 0.8957310700000001
>>
>>> In [11]: _9 - _10
>>> Out[11]: -1.1102230246251565e-16
>>
>>> Looks like a bug in pymvpa or its test suite to me.
>>
>> well -- sure thing we will "fix" the unittest to not rely on precise
>> correspondence any longer since released 1.7.1 is effected.  So it is not
>> a matter of me avoiding "fixing" pymvpa's "bug".
>>
>> I brought it to your attention because
>>
>> 1. from e.g.
>>
>> np.sum(data[:, 0]) - np.sum(data, axis=0)[0]
>>
>> which presumably should be the same order of additions for 0-th column it is
>> not clear that they do not have to be identical.
>
> I agree it's surprising, but I guess it's reasonable for numpy to
> reserve the right to add these guys up in whatever order it chooses,
> and (in this case) maybe a different order for the axis=None, axis=X
> cases.
>
> Also, y'all may have noticed that it is the presence of the second
> vector in the array which causes the difference in the sums of the
> first (see my first email in this thread).   If this is an order
> effect I guess this means that the order of operations in an sum(a,
> axis=X) operation depends on the shape of the array.   And it looks
> like it depends on memory layout:
>
> In [24]: data = np.array([[ 0.49505185,  0],
>    ....:        [ 0.53529587,  0],
>    ....:        [-0.13461665, 0]])
>
> In [25]: np.sum(data[:, 0]) - np.sum(data, axis=0)[0]
> Out[25]: 1.1102230246251565e-16
>
> In [26]: data_F = np.array(data, order='F')
>
> In [27]: np.sum(data_F[:, 0]) - np.sum(data_F, axis=0)[0]
> Out[27]: 0.0
>
> Do we allow the results to be different for different memory layout?

Wasn't this the point of some of Mark Wiebe's optimization?

As far as I understand he got rid of some of the C bias, and made
calculations faster for Fortran contiguous arrays.

I rather have speed for my Fortran arrays, then relying on float
precision issues, that bite anyway when running on many different
kinds
of machines. (I usually have to lower some unit test precision during
Debian testing of statsmodels.)

Josef

>
>> 2. so far they were identical across many numpy releases
>>
>> 3. they are identical on other architectures (e.g. amd64)
>
> To me that is surprising.  I would have guessed that the order is the
> same on 32 and 64 bit, but something about the precision of
> intermediate operations is different.   I don't know enough about
> amd64 to guess what that could be.  Bradley's suggestion seems kind of
> reasonable but it's strange that numpy should use intel-80 bit
> intermediate values differently for 32 and 64 bit.
>
> Cheers,
>
> Matthew
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion



More information about the NumPy-Discussion mailing list