[Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?)
David Cournapeau
david at ar.media.kyoto-u.ac.jp
Sat Mar 22 00:53:31 EDT 2008
Anne Archibald wrote:
>
> There was some discussion of this recently. The most direct approach
> to the problem is to annotate some or all of numpy's inner C loops
> with OpenMP constructs, then provide some python functions to control
> the degree of parallelism OpenMP uses. This would transparently
> provide parallelism for many numpy operations, including sum(),
> numpy's version of IDL's total(). All that is needed is for someone to
> implement it. Nobody has stepped forward yet.
I am not really familiar with OpenMP (I have only played with it on toy
problems). From a build point of view, these are the problems I can see
without knowing much about it:
- compiler support: at the source code level, OpenMP works only through
pragmas, right? So we will get warnings from compilers that do not support
OpenMP if we just use the pragmas as is (this could be solved with a macro, I
guess).
- compiler flags and link flags: at least gcc needs flags for
compiling and linking code with OpenMP. This means detecting whether
the compiler supports it.
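
To make the macro idea concrete, here is a minimal sketch in C. It
assumes a C99 compiler (for _Pragma) and relies on _OPENMP, the macro
that OpenMP-aware compilers define when OpenMP is enabled (with gcc,
when -fopenmp is passed). The names SKETCH_OMP and sum_doubles are
hypothetical and are not taken from the numpy sources.

```c
#include <assert.h>
#include <stddef.h>

/* Expands to an OpenMP pragma only when OpenMP is enabled at compile
 * time; otherwise it expands to nothing, so a compiler without OpenMP
 * support never sees an unknown pragma. SKETCH_OMP is a made-up name. */
#ifdef _OPENMP
#define SKETCH_OMP(x) _Pragma(#x)
#else
#define SKETCH_OMP(x)
#endif

/* Hypothetical inner loop: sum of n doubles. */
double sum_doubles(const double *x, size_t n)
{
    double total = 0.0;
    long i;

    assert(n == 0 || x != NULL);

    /* Each thread accumulates a private partial sum; the partial sums
     * are combined when the loop ends. Without OpenMP this is a plain
     * serial loop. */
    SKETCH_OMP(omp parallel for reduction(+:total))
    for (i = 0; i < (long)n; i++) {
        total += x[i];
    }
    return total;
}
```

The same source builds with or without OpenMP, so the build system only
has to decide whether to add the compiler and linker flags. Note that
the order of the floating-point additions can differ between the two
builds, so results may differ in the last bits for general input.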
This does not sound too bad, but this needs to work reliably on all
supported platforms. Of course, I can add this to numscons; adding it to
distutils would be a bit more work, but I can do it too if someone else
is willing to do the actual coding in the C sources.
Now, the main concern I would have is the effectiveness of all this on
simple operations. I note that Matlab 2007a, while claiming support for
multi-core, does not use multiple cores for simple operations, only for FFT,
BLAS and LAPACK (where this should already be possible for us if e.g. using
Intel MKL, am I right?). Matlab 7.6 also supports things like
element-wise computation (a = sin(b)):
http://www.mathworks.com/products/matlab/demos.html?file=/products/demos/matlab/multithreadedcomputations/multithreadedcomputations.html
Personally, I am wondering whether it would not be more worthwhile to
think first about SSE and co, because it can give the same order of
speed increase without all the problems linked to multi-threading
(slower in the single-threaded case, in particular).
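
As a rough illustration of what the SSE route could look like for an
element-wise loop, here is a sketch in C using the SSE intrinsics from
xmmintrin.h. It assumes single-precision data and a compiler that
defines __SSE__ when SSE is available (as gcc does); add_floats is a
hypothetical name, not a numpy function, and this is not a claim about
how numpy's loops are or should be written.

```c
#include <assert.h>
#include <stddef.h>
#ifdef __SSE__
#include <xmmintrin.h>
#endif

/* Hypothetical inner loop: out[i] = a[i] + b[i] for n floats. */
void add_floats(const float *a, const float *b, float *out, size_t n)
{
    size_t i = 0;

    assert(n == 0 || (a != NULL && b != NULL && out != NULL));

#ifdef __SSE__
    /* Four floats per iteration. Unaligned loads and stores are used so
     * the sketch makes no assumption about pointer alignment. */
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(va, vb));
    }
#endif
    /* Scalar tail, and the whole loop when SSE is not available. */
    for (; i < n; i++) {
        out[i] = a[i] + b[i];
    }
}
```

Everything stays in a single thread here, so there is no thread startup
or synchronization cost to pay on small arrays.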
cheers,
David