On 1/28/07, Fernando Perez <fperez.net@gmail.com> wrote:

On 1/28/07, Charles R Harris <charlesr.harris@gmail.com> wrote:

> > The problem goes away if I remove atlas (atlas3-sse2 for me). But that
> > just introduces another problem: slowness.
> >
> > So is this something to report to Clint Whaley? Or does it have to do
> > with how numpy uses atlas?
>
>
> Interesting, I wonder if ATLAS is resetting the FPU flags and changing the
> rounding mode? It is just the LSB of the mantissa that looks to be
> changing. Before reporting the problem it might be good to pin it down a bit
> more if possible.

Well, the fact that I don't see the problem on a PentiumIII (with
atlas-sse) but I see it on my desktop (atlas-sse2) should tell us
something.  The test code uses double arrays, and SSE2 has double
precision support but it's purely 64-bit doubles.  SSE is
single-precision only, which means that for a double computation,
ATLAS can't use the SSE unit and the x87 FPU does the computation
instead. The x87 FPU uses 80 bits internally for intermediate operations (even
though they only return a normal 64-bit double result), so it's fairly
common to see this kind of thing.
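
A minimal sketch of why a wider intermediate format can flip the last bit of a double result. It does not call ATLAS or touch the FPU. It assumes IEEE 754 doubles with round-to-nearest, and it stands in for the 80-bit intermediate by doing the sum exactly with fractions.Fraction and rounding once at the end. The function names and the input values are made up for illustration.

```python
from fractions import Fraction


def sum_rounding_each_step(values):
    """Add in plain doubles, so every partial sum is rounded to 64 bits."""
    total = 0.0
    for v in values:
        total += v
    return total


def sum_rounding_once(values):
    """Add exactly, then round to a double a single time.

    This is only a stand-in for a wider intermediate format (assumption):
    the partial sums keep bits that a 64-bit double would have dropped.
    """
    exact = sum(Fraction(v) for v in values)
    return float(exact)


# Hypothetical inputs: each small term is half an ulp of 1.0, so it is
# lost when added to 1.0 in double precision (round-half-to-even).
values = [1.0, 2.0 ** -53, 2.0 ** -53]

narrow = sum_rounding_each_step(values)
wide = sum_rounding_once(values)

print(narrow.hex())  # 0x1.0000000000000p+0
print(wide.hex())    # 0x1.0000000000001p+0
```

The two results differ only in the last bit of the mantissa, which is the kind of difference being discussed here. Printing with float.hex() rather than repr makes that bit easy to see when comparing runs.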

But why isn't it consistent, and why does it seem to depend on timing?
That is what makes me think there is a race somewhere, such as in
setting the FPU flags. I googled yesterday for floating point errors
and didn't find anything that looked relevant. Maybe I should try
again with the combination of atlas and sse2.

Chuck