[Numpy-discussion] Cython-based OpenMP-accelerated quartic polynomial solver

Daπid davidmenhur at gmail.com
Thu Oct 1 02:54:14 EDT 2015


On 30 September 2015 at 18:20, Nathaniel Smith <njs at pobox.com> wrote:

> On Sep 30, 2015 2:28 AM, "Daπid" <davidmenhur at gmail.com> wrote:
> [...]
> > Is there a nice way to ship both versions? After all, most
> > implementations of BLAS and friends do spawn OpenMP threads, so I don't
> > think it would be outrageous to take advantage of it in more places,
> > provided there is a nice way to fall back to a serial version when it is
> > not available.
>
> This is incorrect -- the only common implementation of BLAS that uses
> *OpenMP* threads is OpenBLAS, and even then it's not the default -- it only
> happens if you run it in a special non-default configuration.
>
Right, sorry. I meant to say that they spawn parallel threads. What do you
mean by a non-default configuration? Setting OMP_NUM_THREADS?

> The challenges to providing transparent multithreading in numpy generally
> are:
>
> - gcc + OpenMP on linux still breaks multiprocessing. There's a patch to
> fix this but they still haven't applied it; alternatively there's a
> workaround you can use in multiprocessing (not using fork mode), but this
> requires every user update their code and the workaround has other
> limitations. We're unlikely to use OpenMP while this is the case.
>
Any idea when this is going to be released?

As I understand it, OpenBLAS doesn't have this problem, am I right?
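
For reference, the "not using fork mode" workaround mentioned above could look
roughly like the sketch below. It assumes Python 3.4 or later, where
multiprocessing.get_context is available; the helper names spawn_context and
parallel_map are made up for illustration and are not part of numpy or of any
patch discussed here.

```python
import multiprocessing as mp


def spawn_context():
    # "spawn" starts each worker as a fresh interpreter instead of forking
    # the parent, so the worker does not inherit the parent's thread state.
    return mp.get_context("spawn")


def parallel_map(func, values, processes=2):
    # func must be importable by the workers (defined at module top level),
    # because "spawn" pickles it by reference.
    ctx = spawn_context()
    with ctx.Pool(processes=processes) as pool:
        return pool.map(func, values)
```

With "spawn", the calling script has to guard its entry point with
if __name__ == "__main__": before calling parallel_map, which is one of the
changes every user would need to make to their own code.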

> - parallel code in general is not very composable. If someone is calling a
> numpy operation from one thread, great, transparently using multiple
> threads internally is a win. If they're exploiting some higher-level
> structure in their problem to break it into pieces and process each in
> parallel, and then using numpy on each piece, then numpy spawning threads
> internally will probably destroy performance. And numpy is too low-level to
> know which case it's in. This problem exists to some extent already with
> multi-threaded BLAS, so people use various BLAS-specific knobs to manage it
> in ad hoc ways, but this doesn't scale.
>
> (Ironically OpenMP is more composable than most approaches to threading,
> but only if everyone is using it and, as per above, not everyone is and we
> currently can't.)
>
That is what I meant by also providing a single-threaded version. The user
can choose whether they want the parallel or the serial one, depending on
the case.
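
A minimal sketch of what I have in mind, assuming the parallel code lives in
an optional compiled module. The module name _polyeval_omp and its polyval
function are hypothetical, and the Horner evaluation is only a stand-in for
the real kernel:

```python
import importlib


def load_parallel_backend(module_name="_polyeval_omp"):
    """Return the optional compiled module, or None if it is not installed.

    The default module name is hypothetical; it stands for an extension
    built with OpenMP support.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


def polyval_serial(coeffs, xs):
    """Evaluate a polynomial (highest degree first) at each x with Horner's rule."""
    results = []
    for x in xs:
        acc = 0.0
        for c in coeffs:
            acc = acc * x + c
        results.append(acc)
    return results


def polyval(coeffs, xs, parallel=False):
    """Use the parallel backend only when requested and available."""
    backend = load_parallel_backend() if parallel else None
    if backend is None:
        return polyval_serial(coeffs, xs)
    # Assumed to mirror the serial signature.
    return backend.polyval(coeffs, xs)
```

The serial path is the default, so code that already parallelises at a higher
level does not get extra threads unless it asks for them with parallel=True.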