[Numpy-discussion] adding fused multiply and add to numpy

Julian Taylor jtaylor.debian at googlemail.com
Thu Jan 9 20:06:01 EST 2014

On 10.01.2014 01:49, Frédéric Bastien wrote:
> Do you know if those instructions are automatically used by gcc if we
> use the right architecture parameter?

They are used if you enable -ffp-contract=fast. Do not set it to `on`;
this is an alias for `off` due to the semantics of C.
-ffast-math enables it in gcc 4.7 and 4.8 but not in 4.9, but this might
be a bug; I filed one a while ago.

Also you need to set -mfma or -march=bdver{1,2,3,4}. It's not part of
-mavx2, last I checked.

But there are not many places in numpy where the compiler can use it; only
dot comes to mind, which goes through BLAS libraries in the high-performance
case.
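
For illustration, the kind of loop I mean looks roughly like the naive dot
product below. This is a hypothetical stand-alone function (`naive_dot` is
not a numpy or BLAS symbol), only meant to show the multiply-accumulate
pattern a compiler could turn into FMA instructions.

```c
#include <stddef.h>

/* Naive dot product: each iteration is a multiply followed by an
 * add into the accumulator, i.e. a candidate for FMA contraction. */
double naive_dot(const double *x, const double *y, size_t n)
{
    double acc = 0.0;
    for (size_t i = 0; i < n; i++) {
        acc += x[i] * y[i];
    }
    return acc;
}
```

With contraction enabled the result can differ in the last bits from the
unfused version, since each step rounds once instead of twice.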

