I am interested in adding SSE optimizations to numpy, where should I start? Fode
On 7/10/12 5:07 PM, Fode wrote:
I am interested in adding SSE optimizations to numpy, where should I start?
Well, to my knowledge there is not much open source code that uses SSE (Intel MKL and AMD ACML do not count here), but a good start could be: http://gruntthepeon.free.fr/ssemath/ I'd say that NumPy could benefit a lot from integrating optimized versions of transcendental functions (like the ones at the link above). Good luck! -- Francesc Alted
Some more context on what Francesc said:
If you mean using SSE for simple things like addition and multiplication, then you should be aware that the way NumPy works lends itself very badly to such optimizations. For small arrays, the Python interpreter overhead tends to dominate, and for large arrays it's all about memory bus speed.
There's a video online of a talk Francesc gave at PyData this year that explains this and the current options.
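To make the small-array/large-array argument a bit more concrete, here is a rough back-of-envelope model for c = a + b on float64 arrays. It is only a sketch: every constant (call overhead, bandwidths, cache size, scalar throughput) is a made-up illustrative assumption rather than a measurement, and the helper names estimated_time and sse_speedup are hypothetical. The only hard fact in it is that an SSE2 register holds two float64 values.

```python
# Back-of-envelope cost model for c = a + b on float64 arrays.
# All constants below are illustrative assumptions, not measurements.
CALL_OVERHEAD_S = 1e-6       # assumed fixed interpreter/dispatch cost per call
CACHE_BYTES = 1 << 20        # assumed cache size (1 MiB)
CACHE_BANDWIDTH_BPS = 100e9  # assumed bandwidth when the data fits in cache
RAM_BANDWIDTH_BPS = 10e9     # assumed main-memory (bus) bandwidth
SCALAR_ADDS_PER_S = 2e9      # assumed scalar add throughput
SSE_WIDTH = 2                # SSE2 handles two float64 values per instruction
ITEMSIZE = 8                 # bytes per float64


def estimated_time(n, simd_width=1):
    """Estimated seconds for one call adding two length-n float64 arrays."""
    bytes_moved = 3 * n * ITEMSIZE  # read a, read b, write c
    if bytes_moved <= CACHE_BYTES:
        bandwidth = CACHE_BANDWIDTH_BPS
    else:
        bandwidth = RAM_BANDWIDTH_BPS
    memory_time = bytes_moved / bandwidth
    compute_time = n / (SCALAR_ADDS_PER_S * simd_width)
    # Arithmetic and memory traffic overlap, so the slower one sets the pace;
    # the per-call overhead is paid on top of that.
    return CALL_OVERHEAD_S + max(compute_time, memory_time)


def sse_speedup(n):
    """Ratio of the scalar estimate to the SSE estimate for length n."""
    return estimated_time(n) / estimated_time(n, SSE_WIDTH)


for n in (10, 10_000, 10_000_000):
    print(f"n={n:>10}: estimated SSE speedup {sse_speedup(n):.3f}x")
```

Under these assumed numbers the tiny array gains almost nothing because the fixed per-call cost swamps the arithmetic, and the huge array gains nothing because it is waiting on memory either way. Only the cache-sized case in between shows a noticeable gain in this model. Real machines will differ, so treat the output as an illustration of the reasoning, not a prediction.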
People are working on it (e.g. right now in Numba and Cython), and down the road perhaps NumPy 3.0 or 4.0 can have better performance. But it's pretty complicated work, and it'd be difficult to dive in without learning more first.
(Mark Florisson is currently working on a library, reusable across projects, that will bring SSE/vectorization to Cython. It beats Intel Fortran in some benchmarks! :-)
Dag
participants (3): Dag Sverre Seljebotn, Fode, Francesc Alted