[Numpy-discussion] Bug in logaddexp2.reduce

Thu Apr 1 02:46:20 EDT 2010

On 1 April 2010 02:24, Charles R Harris <charlesr.harris at gmail.com> wrote:
>
>
> On Thu, Apr 1, 2010 at 12:04 AM, Anne Archibald <peridot.faceted at gmail.com>
> wrote:
>>
>> On 1 April 2010 01:59, Charles R Harris <charlesr.harris at gmail.com> wrote:
>> >
>> >
>> > On Wed, Mar 31, 2010 at 11:46 PM, Anne Archibald
>> > <peridot.faceted at gmail.com>
>> > wrote:
>> >>
>> >> On 1 April 2010 01:40, Charles R Harris <charlesr.harris at gmail.com>
>> >> wrote:
>> >> >
>> >> >
>> >> > On Wed, Mar 31, 2010 at 11:25 PM, <josef.pktd at gmail.com> wrote:
>> >> >>
>> >> >> On Thu, Apr 1, 2010 at 1:22 AM,  <josef.pktd at gmail.com> wrote:
>> >> >> > On Thu, Apr 1, 2010 at 1:17 AM, Charles R Harris
>> >> >> > <charlesr.harris at gmail.com> wrote:
>> >> >> >>
>> >> >> >>
>> >> >> >> On Wed, Mar 31, 2010 at 6:08 PM, <josef.pktd at gmail.com> wrote:
>> >> >> >>>
>> >> >> >>> On Wed, Mar 31, 2010 at 7:37 PM, Warren Weckesser
>> >> >> >>> <warren.weckesser at enthought.com> wrote:
>> >> >> >>> > T J wrote:
>> >> >> >>> >> On Wed, Mar 31, 2010 at 1:21 PM, Charles R Harris
>> >> >> >>> >> <charlesr.harris at gmail.com> wrote:
>> >> >> >>> >>
>> >> >> >>> >>> Looks like roundoff error.
>> >> >> >>> >>>
>> >> >> >>> >>>
>> >> >> >>> >>
>> >> >> >>> >> So this is "expected" behavior?
>> >> >> >>> >>
>> >> >> >>> >> In [1]: np.logaddexp2(-1.5849625007211563,
>> >> >> >>> >> -53.584962500721154)
>> >> >> >>> >> Out[1]: -1.5849625007211561
>> >> >> >>> >>
>> >> >> >>> >> In [2]: np.logaddexp2(-0.5849625007211563,
>> >> >> >>> >> -53.584962500721154)
>> >> >> >>> >> Out[2]: nan
>> >> >> >>> >>
>> >> >> >>> >
>> >> >> >>> > Is any able to reproduce this?  I don't get 'nan' in either
>> >> >> >>> > 1.4.0
>> >> >> >>> > or
>> >> >> >>> > 2.0.0.dev8313 (32 bit Mac OSX).  In an earlier email T J
>> >> >> >>> > reported
>> >> >> >>> > using
>> >> >> >>> > 1.5.0.dev8106.
>> >> >> >>>
>> >> >> >>>
>> >> >> >>>
>> >> >> >>> >>> np.logaddexp2(-0.5849625007211563, -53.584962500721154)
>> >> >> >>> nan
>> >> >> >>> >>> np.logaddexp2(-1.5849625007211563, -53.584962500721154)
>> >> >> >>> -1.5849625007211561
>> >> >> >>>
>> >> >> >>> >>> np.version.version
>> >> >> >>> '1.4.0'
>> >> >> >>>
>> >> >> >>> WindowsXP 32
>> >> >> >>>
>> >> >> >>
>> >> >> >> What compiler? Mingw?
>> >> >> >
>> >> >> > yes, mingw 3.4.5. , official binaries release 1.4.0 by David
>> >> >>
>> >> >> sse2 Pentium M
>> >> >>
>> >> >
>> >> > Can you try the exp2/log2 functions with the problem data and see if
>> >> > something goes wrong?
>> >>
>> >> Works fine for me.
>> >>
>> >> If it helps clarify things, the difference between the two problem
>> >> input values is exactly 53 (and that's what logaddexp2 does an exp2
>> >> of); so I can provide a simpler example:
>> >>
>> >> In [23]: np.logaddexp2(0, -53)
>> >> Out[23]: nan
>> >>
>> >> Of course, for me it fails in both orders.
>> >>
>> >
>> > Ah, that's progress then ;) The effective number of bits in a double is
>> > 53
>> > (52 + implicit bit). That shouldn't cause problems but it sure looks
>> > suspicious.
>>
>> Indeed, that's what led me to the totally wrong suspicion that
>> denormals have something to do with the problem. More data points:
>>
>> In [38]: np.logaddexp2(63.999, 0)
>> Out[38]: nan
>>
>> In [39]: np.logaddexp2(64, 0)
>> Out[39]: 64.0
>>
>> In [42]: np.logaddexp2(52.999, 0)
>> Out[42]: 52.999000000000002
>>
>> In [43]: np.logaddexp2(53, 0)
>> Out[43]: nan
>>
>> It looks to me like perhaps the NaNs are appearing when the smaller
>> term affects only the "extra" bits provided by the FPU's internal
>> larger-than-double representation. Some such issue would explain why
>> the problem seems to be hardware- and compiler-dependent.
>>
>
> Hmm, that 63.999 is kinda strange. Here is a bit of code that might confuse
> a compiler working with different size mantissas:
>
> @type@ npy_log2_1p at c@(@type@ x)
> {
>     @type@ u = 1 + x;
>     if (u == 1) {
>         return LOG2E*x;
>     } else {
>         return npy_log2 at c@(u) * x / (u - 1);
>     }
> }
>
> It might be that u != 1 does not imply u-1 != 0.

That does indeed look highly suspicious. I'm not entirely sure how to
work around it. GSL uses a volatile declaration:
http://www.google.ca/codesearch/p?hl=en#p9nGS4eQGUI/gnu/gsl/gsl-1.8.tar.gz%7C8VCQSLJ5jR8/gsl-1.8/sys/log1p.c&q=log1p
On the other hand boost declares itself defeated by optimizing
compilers and uses a Taylor series:
http://www.google.ca/codesearch/p?hl=en#sdP2GRSfgKo/dcplusplus/trunk/boost/boost/math/special_functions/log1p.hpp&q=log1p&sa=N&cd=7&ct=rc
While R makes no mention of the corrected formula or optimizing
compilers but takes the same approach, only with Chebyshev series:
http://www.google.ca/codesearch/p?hl=en#gBBSWbwZmuk/src/base/R-2/R-2.3.1.tar.gz%7CVuh8XhRbUi8/R-2.3.1/src/nmath/log1p.c&q=log1p

Since, at least on my machine, ordinary log1p appears to work fine, is
there any reason not to have log2_1p call it and scale the result? Or
does the compiler make a hash of our log1p too?

Anne

> Chuck
>
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>