[Numpy-discussion] -ffast-math

Dan Goodman dg.gmane at thesamovar.net
Sun Dec 1 15:53:14 EST 2013

Julian Taylor <jtaylor.debian <at> googlemail.com> writes:
> can you show the code that is slow in numpy?
> which version of gcc and libc are you using?
> with gcc 4.8 it uses the glibc 2.17 sin/cos with fast-math, so there
> should be no difference.

In trying to write some simple code to demonstrate it, I realised it was
weirdly more complicated than I thought. Previously I had been comparing
numpy against weave on a complicated expression, namely a*sin(2.0*freq*pi*t)
+ b + v*exp(-dt/tau) + (-a*sin(2.0*freq*pi*t) - b)*exp(-dt/tau). Here only a
and v are arrays. Doing that with weave and no -ffast-math took approximately
the same time as numpy, but with weave and -ffast-math it was about 30x
faster. Since numpy and weave with no -ffast-math took about the same time, I
assumed the computation wasn't memory bound and that the speedup was down to
-ffast-math.
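
For illustration, the numpy side of that expression might look roughly like
the sketch below. It is not the original benchmark: the array size and every
parameter value are assumptions picked only so that it runs, and the function
name update is hypothetical. Only a and v are arrays, as described above, so
the sin and exp terms are scalar values that get computed once per call.

```python
import numpy as np

# Hypothetical array size and parameter values, for illustration only.
N = 100000
rng = np.random.RandomState(0)
a = rng.rand(N)                                    # array
v = rng.rand(N)                                    # array
b, freq, t, dt, tau = 1.2, 10.0, 0.5, 1e-4, 0.02   # scalars
pi = np.pi

def update(a, v):
    # The expression from the post, written with numpy ufuncs.
    # sin(...) and exp(...) act on scalars here, so they are evaluated once;
    # the multiplications and additions with a and v create temporary arrays.
    return (a*np.sin(2.0*freq*pi*t) + b + v*np.exp(-dt/tau)
            + (-a*np.sin(2.0*freq*pi*t) - b)*np.exp(-dt/tau))

result = update(a, v)
```

Timing update(a, v) with timeit gives the numpy figure to set against a
compiled version of the same expression.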

Here's the demo code (you might need to comment a couple of lines out if you
want to actually run it, since it also tests a couple of things that depend
on a library):


However, when I did a simple example that just computed y=sin(x) for arrays
x and y, I found that numpy and weave without -ffast-math took about the
same time, but weave with -ffast-math was significantly slower than numpy!
My take-home message from this: optimisation is weird. Could it be that
-ffast-math and -O3 allow SSE instructions and that there is some overhead
to this that makes it worth it for a complex expression but not for a simple
one?
Here's the code for the simple example (doesn't have any dependencies):


For reference, I'm on a newish 64-bit Windows machine running 32-bit Python
2.7.3, gcc version 4.5.2, numpy 1.8.0 installed from binaries.

