[Numpy-discussion] help using np.correlate to produce correlograms.
pierre.haessig at crans.org
Thu Dec 11 09:49:35 EST 2014
On 11/12/2014 15:39, Julian Taylor wrote:
> Previously, numpy called dot for the convolution part. This is fine for
> large convolutions, since dot dispatches to BLAS, which is very fast.
> For small convolutions, unfortunately, it is terrible: the generic dot
> in BLAS libraries has an enormous overhead that is only amortized on
> large data. So one part of the change was computing the dot in a simple
> numpy-internal loop when the data is small.
> The second part concerns the number of registers typical machines
> have, e.g. amd64 has 16 floating-point registers. If you can put all
> elements of a convolution kernel into these registers, you save
> reloading them from the stack on each iteration.
> 11 is the largest number of kernel elements I could reliably use
> without the compiler spilling them to the stack.
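
To make the two code paths described above concrete, here is a rough
Python sketch of the dispatch idea. It is only an illustration, not
numpy's actual C implementation: the function name correlate_valid and
the constant SMALL_KERNEL_MAX are hypothetical, and Python cannot pin
values to CPU registers, so the small-kernel branch only mimics the
"read each kernel element once and reuse it" pattern. The threshold of
11 is taken from the message above.

```python
import numpy as np

# Hypothetical threshold, taken from the "11" quoted in the message above.
SMALL_KERNEL_MAX = 11


def correlate_valid(a, v):
    """'valid'-mode cross-correlation of two real 1-D arrays.

    Assumes len(a) >= len(v). Illustrative only.
    """
    a = np.asarray(a, dtype=float)
    v = np.asarray(v, dtype=float)
    n_out = len(a) - len(v) + 1

    if len(v) <= SMALL_KERNEL_MAX:
        # Small kernel: each kernel element is read once as a scalar and
        # reused across the whole output, with no per-output dot call.
        out = np.zeros(n_out)
        for k, vk in enumerate(v):
            out += vk * a[k:k + n_out]
        return out

    # Large kernel: hand the inner products to a matrix-vector product,
    # which numpy forwards to BLAS for float arrays.
    windows = np.lib.stride_tricks.sliding_window_view(a, len(v))
    return windows @ v
```

Both branches should agree with np.correlate(a, v, mode="valid") up to
floating-point rounding; only the way the inner products are evaluated
differs. Note that sliding_window_view requires numpy 1.20 or later.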