[Python-ideas] solving multi-core Python

Nick Coghlan ncoghlan at gmail.com
Tue Jun 23 00:30:13 CEST 2015

On 23 Jun 2015 03:37, "Gregory P. Smith" <greg at krypto.org> wrote:
> On Sun, Jun 21, 2015 at 4:56 AM Devin Jeanpierre <jeanpierreda at gmail.com>
>> On Sat, Jun 20, 2015 at 4:16 PM, Eric Snow <ericsnowcurrently at gmail.com>
>> >
>> > On Jun 20, 2015 4:55 PM, "Devin Jeanpierre" <jeanpierreda at gmail.com>
>> >>
>> >> It's worthwhile to consider fork as an alternative.  IMO we'd get a
>> >> lot out of making forking safer, easier, and more efficient. (e.g.
>> >> respectively: adding an atfork registration mechanism; separating out
>> >> the bits of multiprocessing that use pickle from those that don't;
>> >> moving the refcount to a separate page, or allowing it to be frozen
>> >> prior to a fork.)
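[Editor's note: CPython later grew exactly this kind of atfork registration
mechanism as os.register_at_fork(), added in Python 3.7 (i.e. after this
thread). A minimal POSIX-only sketch of how such hooks behave:]

```python
import os

# Messages recorded by the atfork hooks, so the call order is visible.
events = []

# os.register_at_fork (Python 3.7+) is the "atfork registration
# mechanism" discussed above: callbacks that run around every fork().
os.register_at_fork(
    before=lambda: events.append("before-fork"),
    after_in_parent=lambda: events.append("after-in-parent"),
    after_in_child=lambda: events.append("after-in-child"),
)

pid = os.fork()
if pid == 0:
    # Child process: inherited "before-fork", then ran "after-in-child".
    os._exit(0)

os.waitpid(pid, 0)
print(events)  # parent sees: ['before-fork', 'after-in-parent']
```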
>> >
>> > So leverage a common base of code with the multiprocessing module?
>> What is this question in response to? I don't understand.
>> > I would expect subinterpreters to use less memory.  Furthermore,
>> > creating them would be significantly faster.  Passing objects between
>> > them would be much more efficient.  And, yes, cross-platform.
>> Maybe I don't understand how subinterpreters work. AIUI, the whole
>> point of independent subinterpreters is that they share no state. So
>> if I have a web server, each independent serving thread has to do all
>> of the initialization (import HTTP libraries, etc.), right? Compare
>> with forking, where the initialization is all done and then you fork,
>> and you are immediately ready to serve, using the data structures
>> shared with all the other workers, which is only copied when it is
>> written to.
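[Editor's note: the pre-fork pattern Devin describes — initialize once, fork,
and serve immediately from copy-on-write memory — can be sketched as follows
(POSIX-only; the table contents are purely illustrative):]

```python
import os

# Simulate expensive one-time initialization (imports, parsed configs, ...).
SHARED_TABLE = {name: len(name) for name in ("http", "json", "ssl")}

children = []
for _ in range(2):
    pid = os.fork()
    if pid == 0:
        # Child worker: immediately ready to serve. SHARED_TABLE was
        # inherited via copy-on-write; no re-initialization is needed.
        assert SHARED_TABLE["http"] == 4
        os._exit(0)
    children.append(pid)

# Parent waits for the workers; a nonzero status would mean a child failed.
statuses = [os.waitpid(pid, 0)[1] for pid in children]
print(statuses)  # → [0, 0]
```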
> Unfortunately CPython subinterpreters do share some state, though it is
> not visible to the running code in many cases.  Thus the other mentions of
> "wouldn't it be nice if CPython didn't assume a single global state per
> process" (100% agreed, but tangential to this discussion)...
> https://docs.python.org/3/c-api/init.html#sub-interpreter-support
> You are correct that some things that could make sense to share, such as
> imported modules, would not be shared as they are in a forked environment.
> This is an important oddity of subinterpreters: They have to re-import
> everything other than extension modules. When you've got a big process
> with a ton of modules (like, say, 100s of protocol buffers...), that's
> going to be a non-starter (pun intended) for the use of
> threads+subinterpreters as a fast form of concurrency if they need to
> import most of those from each subinterpreter. Startup latency and CPU
> usage += lots. (Possibly more memory use as well, though with our existing
> refcount implementation forcing needless PyObject page writes even during
> reads, which cause fork()'s copy-on-write to copy those pages anyway...
> it's impossible to guess.)
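[Editor's note: the copy-on-write problem Greg alludes to was later partially
addressed by gc.freeze() in Python 3.7, which parks all currently tracked
objects in a "permanent generation" that the cyclic collector never scans, so
forked children's pages are not dirtied by collector bookkeeping. It does not
help with refcount writes themselves, which is why the thread discusses moving
refcounts to separate pages. A sketch:]

```python
import gc

# Before freezing, no objects live in the permanent generation.
before = gc.get_freeze_count()

# gc.freeze() moves every currently tracked object into a permanent
# generation the cyclic GC never scans, so a child forked afterwards
# won't have those pages dirtied by collector bookkeeping.
gc.freeze()
after = gc.get_freeze_count()

# Undo, so normal collection resumes in this process.
gc.unfreeze()
print(after > before)
```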
> What this means for subinterpreters in this case is not much different
> from starting up multiple worker processes: You need to start them up and
> wait for them to be ready to serve, then reuse them as long as feasible
> before recycling them to start up a new one. The startup cost is high.

While I don't believe it's clear from the current text in the PEP (mostly
because I only figured it out while hacking on the prototype
implementation), PEP 432 should actually give us much better control over
how subinterpreters are configured, as many more interpreter settings move
out of global variables and into the interpreter state:
https://www.python.org/dev/peps/pep-0432/ (the global variables will still
exist, but primarily as an input to the initial configuration of the main
interpreter).

The current state of that work can be seen at

While a lot of things are broken there, it's at least to the point where it
can start running the regression test suite under the new 2-phase
initialisation model.

