On 1/28/07, Keith Goodman <kwgoodman@gmail.com> wrote:

On 1/28/07, Keith Goodman <kwgoodman@gmail.com> wrote:
> On 1/28/07, Fernando Perez <fperez.net@gmail.com> wrote:
> > On 1/28/07, Keith Goodman <kwgoodman@gmail.com> wrote:
> > > On 1/28/07, Fernando Perez <fperez.net@gmail.com> wrote:
> > > > [snip]  The test code uses double arrays, and SSE2 has double
> > > > precision support but it's purely 64-bit doubles.  SSE is
> > > > single-precision only, which means that for a double computation,
> > > > ATLAS isn't used and the Intel FPU does the computation instead.
> > >
> > > So since I use N.float64, ATLAS SSE won't help me?
> >
> > Well, the SSE part won't, but you're still better off with ATLAS than
> > with the default reference BLAS implementation.  I think even an ATLAS
> > SSE has special code for double (not using any SSE-type engine) that's
> > faster than the reference BLAS which is pure generic Fortran.  Someone
> > who knows the ATLAS internals please correct me if that's not the
> > case.
>
> That makes sense.
>
> Unfortunately my simulation gives different results with and without
> ATLAS SSE even though the test script I made doesn't detect the
> difference.

ATLAS BASE (no SSE or SSE2) also gives me different simulation
results even though it passes the test script.

Hmmm, I wonder if the operations are being done in a different order.
That could affect rounding, since floating-point addition isn't
associative. Even optimization settings could change the result if
someone wasn't careful to use parentheses to force the order of
evaluation. This is all very interesting.
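
To illustrate the rounding point, here is a minimal sketch in plain
Python (no ATLAS or NumPy involved). It only shows that the same
double-precision numbers can add up differently depending on the order
of the additions; it is not a claim about what ATLAS actually does
internally. The helper name `accumulate` and the sample values are
made up for this example.

```python
import math

# Floating-point addition is not associative: the grouping matters.
a, b, c = 0.1, 0.2, 0.3
left = (a + b) + c
right = a + (b + c)
print(left == right)  # False
print(left, right)


def accumulate(values):
    """Sum left to right with a plain running total (hypothetical helper).

    An explicit loop is used rather than the built-in sum(), because
    recent Python versions apply extra error compensation inside sum().
    """
    total = 0.0
    for v in values:
        total += v
    return total


# The same four numbers, accumulated in two different orders.
values = [1e16, 1.0, -1e16, 1.0]
forward = accumulate(values)
reordered = accumulate(sorted(values))
exact = math.fsum(values)  # correctly rounded reference sum

print(forward, reordered, exact)
```

If a simulation feeds results like these back into later steps, small
differences of this kind can grow, which might be why a short test
script passes while a long run diverges.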

Chuck