[docs] [issue27422] Deadlock when mixing threading and multiprocessing

Davin Potts report at bugs.python.org
Mon Jul 4 11:19:42 EDT 2016

Davin Potts added the comment:

While I believe I understand the motivation behind the suggestion to detect when the code is doing something potentially dangerous, I'll point out a few things:
* any time you ask for a layer of convenience, you must choose something to sacrifice to get it (usually performance is sacrificed) and this sacrifice will affect all code (including non-problematic code)
* behind the scenes multiprocessing itself is employing multiple threads in the creation and coordination between processes -- "checking to see if there are multiple threads active on process creation" is therefore a more complicated request than it maybe first appears
* Regarding "Python makes it very easy to mix these two", I'd say it's nearly as easy to mix the two in C code -- the common pattern across different languages is to learn the pros+cons+gotchas of working with processes and threads

I too come from the world of scientific software and the mixing of Fortran, C/C++, and Python (yay science and yay Fortran) so I'll make another point (apologies if you already knew this):
There's a lot of computationally intensive code in scientific code/applications, and being able to perform those computations in parallel is a wonderful thing.  I am unsure whether the tests you're trying to speed up exercise compute-intensive functions, but let's assume they do.

For reasons not described here (the Global Interpreter Lock), using the CPython implementation there is a constraint on the use of threads that restricts them to all run on a single core of your multi-core CPU (and on only one CPU if you have an SMP system).  Hence spinning up threads to perform compute-intensive tasks will likely result in no better throughput (no speedup) because they're all fighting over the same maxed-out core.  To spread out onto and take advantage of multiple cores (and multiple CPUs on an SMP system) you will want to switch to creating processes (as you say you now have).

I'd make the distinction that you are likely much more interested in 'parallel computing' than 'concurrent execution'.  Since you're already using multiprocessing, you might also simply use `multiprocessing.Pool`.


Python tracker <report at bugs.python.org>
