On 1/28/07, Charles R Harris wrote:
The problem goes away if I remove atlas (atlas3-sse2 for me). But that just introduces another problem: slowness.
So is this something to report to Clint Whaley? Or does it have to do with how numpy uses atlas?
Interesting. I wonder if ATLAS is resetting the FPU flags and changing the rounding mode. It is just the LSB of the mantissa that looks to be changing. Before reporting the problem, it might be good to pin it down a bit more if possible.
Well, the fact that I don't see the problem on a Pentium III (with
atlas-sse) but I do see it on my desktop (atlas-sse2) should tell us
something. The test code uses double arrays. SSE2 supports double
precision, but strictly as 64-bit doubles. SSE is single-precision
only, which means that for a double computation the SSE unit can't
be used and the x87 FPU does the computation instead. The x87 FPU
uses 80 bits internally for intermediate operations (even though it
only returns a normal 64-bit double result), so it's fairly common
to see last-bit differences like this between the two machines.
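To make the effect concrete, here is a minimal sketch in C. It is not the
test code from this thread and has nothing to do with ATLAS itself: the
function names and the input values are made up for illustration, and
long double is only a stand-in for the 80-bit intermediates (on some
platforms long double is no wider than double, in which case both
functions agree).

```c
#include <float.h>
#include <stddef.h>

/* Hypothetical helper: accumulate in a 64-bit double. The volatile
   qualifier forces the running sum to be stored, and therefore rounded
   to double, after every addition, even when the compiler targets the
   x87 FPU. */
double sum_rounded_each_step(const double *x, size_t n)
{
    volatile double acc = 0.0;
    for (size_t i = 0; i < n; i++)
        acc += x[i];
    return acc;
}

/* Hypothetical helper: accumulate in long double and round to double
   only once at the end, imitating wider intermediate precision. */
double sum_extended_intermediate(const double *x, size_t n)
{
    long double acc = 0.0L;
    for (size_t i = 0; i < n; i++)
        acc += x[i];
    return (double)acc;
}
```

Feeding both functions the three values 1.0, DBL_EPSILON / 2 and
DBL_EPSILON / 2 shows the difference. With rounding after every step,
each small term is lost and the sum stays at 1.0. Where long double has
at least a 64-bit mantissa, the two small terms survive in the
accumulator and the final result is 1.0 + DBL_EPSILON, that is, it
differs in the LSB only.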
You can test things by writing a little program in C that does the
same operations, and use this little trick:
#include