[Numpy-discussion] Cython-based OpenMP-accelerated quartic polynomial solver
Juha Jeronen
juha.jeronen at jyu.fi
Fri Oct 2 05:58:49 EDT 2015
On 01.10.2015 03:52, Sturla Molden wrote:
> On 01/10/15 02:32, Juha Jeronen wrote:
>
>> Sounds good. Out of curiosity, are there any standard fork-safe
>> threadpools, or would this imply rolling our own?
>
> You have to roll your own.
>
> Basically use pthread_atfork to register a callback that shuts down
> the threadpool before a fork and another callback that restarts it.
> Python's threading module does not expose the pthread_atfork
> function, so you must call it from Cython.
>
> I believe there is also a tiny atfork module in PyPI.
Ok. Thanks. This approach fixes the issue of the threads not being there
for the child process.

I think it still leaves open the issue of creating the correct number of
threads in each process's pool when the pool is restarted, so that the
total across all processes matches the number of cores (physical or
virtual, whichever the user desires). But this is again something that
requires context...
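The shutdown-before-fork, restart-after-fork idea can be sketched in plain Python. This is only an illustration under several assumptions: it uses os.register_at_fork (added in Python 3.7, Unix only) in place of calling pthread_atfork from Cython, it uses concurrent.futures.ThreadPoolExecutor as the pool, and the ForkSafePool class and the max_workers=4 value are hypothetical. It does not address the thread-count question: parent and child each recreate the same number of workers.

```python
import os
from concurrent.futures import ThreadPoolExecutor


class ForkSafePool:
    """Hypothetical wrapper: a thread pool that is torn down around fork()."""

    def __init__(self, max_workers):
        self.max_workers = max_workers
        self._executor = ThreadPoolExecutor(max_workers=max_workers)

    def submit(self, fn, *args, **kwargs):
        return self._executor.submit(fn, *args, **kwargs)

    def shutdown(self):
        # Intended to run just before fork(): finish pending work and
        # join the worker threads so that none exist during the fork.
        if self._executor is not None:
            self._executor.shutdown(wait=True)
            self._executor = None

    def restart(self):
        # Intended to run after fork() in both parent and child.
        # Each process gets max_workers threads again; splitting the
        # cores between the processes is left open here.
        if self._executor is None:
            self._executor = ThreadPoolExecutor(max_workers=self.max_workers)


pool = ForkSafePool(max_workers=4)

# os.register_at_fork exists only on Unix with Python 3.7 or later.
if hasattr(os, "register_at_fork"):
    os.register_at_fork(
        before=pool.shutdown,
        after_in_parent=pool.restart,
        after_in_child=pool.restart,
    )
```

With the hooks registered, a call to os.fork() should leave both processes with a freshly created pool, so pool.submit(...) keeps working on either side.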
>> So maybe it would be better, at least at first, to make a pure-Cython
>> version with no attempt at multithreading?
>
> I would start by making a pure Cython version that works correctly.
> The next step would be to ensure that it releases the GIL. After that
> you can worry about parallel processing, or just tell the user to use
> threads or joblib.
First version done and uploaded:

https://yousource.it.jyu.fi/jjrandom2/miniprojects/trees/master/misc/polysolve_for_numpy

OpenMP support has been removed; this version uses only Cython.

The example program has been renamed to main.py, and setup.py has been
cleaned up by removing the irrelevant module. This folder contains only
the files for the polynomial solver.

As I suspected, removing OpenMP support required changing only a few
lines and dropping the import of cython.parallel. The "prange" loops
have been replaced with "with nogil" blocks and plain "range" loops.

Note that both the original version and this version release the GIL
when running the processing loops.

It may be better to leave this single-threaded for now. Using Python
threads isn't that difficult and joblib sounds nice, too.
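For illustration, driving a GIL-releasing routine from Python threads could look roughly like the sketch below. The names solve_chunk and solve_threaded are hypothetical, and solve_chunk is a pure-Python quadratic solver standing in for the compiled quartic routine. Because the stand-in holds the GIL, this shows the structure only; a real speedup needs a routine that releases the GIL, as the Cython loops do.

```python
import cmath
from concurrent.futures import ThreadPoolExecutor


def solve_chunk(chunk):
    # Hypothetical stand-in for the compiled solver: roots of
    # a*x**2 + b*x + c for each (a, b, c) in the chunk.
    out = []
    for a, b, c in chunk:
        d = cmath.sqrt(b * b - 4 * a * c)
        out.append(((-b + d) / (2 * a), (-b - d) / (2 * a)))
    return out


def solve_threaded(polys, n_threads=4):
    # Split the input into contiguous chunks, roughly one per thread.
    size = max(1, -(-len(polys) // n_threads))  # ceiling division
    chunks = [polys[i:i + size] for i in range(0, len(polys), size)]
    with ThreadPoolExecutor(max_workers=n_threads) as ex:
        # map() preserves chunk order, so results line up with the input.
        parts = list(ex.map(solve_chunk, chunks))
    return [roots for part in parts for roots in part]


polys = [(1.0, -3.0, 2.0), (1.0, 0.0, 1.0), (2.0, -4.0, 0.0)]
roots = solve_threaded(polys, n_threads=2)
```

The caller picks n_threads explicitly, which sidesteps the question of how many threads a library-level pool should create.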
What's the next step?
-J