[Python-ideas] solving multi-core Python

Nathaniel Smith njs at pobox.com
Tue Jun 23 01:59:40 CEST 2015


On Mon, Jun 22, 2015 at 10:37 AM, Gregory P. Smith <greg at krypto.org> wrote:
> This is an important oddity of subinterpreters: They have to re-import
> everything other than extension modules. When you've got a big process with
> a ton of modules (like, say, 100s of protocol buffers...), that's going to
> be a non-starter (pun intended) for the use of threads+subinterpreters as a
> fast form of concurrency if they need to import most of those from each
> subinterpreter. Startup latency and CPU usage += lots. (Possibly it uses
> more memory as well, but given that our existing refcount implementation
> forces needless PyObject page writes even on reads, which makes fork()'s
> copy-on-write pages get copied anyway... impossible to guess.)
>
> What this means for subinterpreters in this case is not much different from
> starting up multiple worker processes: You need to start them up and wait
> for them to be ready to serve, then reuse them as long as feasible before
> recycling them to start up a new one. The startup cost is high.

One possibility would be for subinterpreters to copy modules from the
main interpreter -- I guess your average module is mostly dicts,
strings, type objects, and functions; strings (and the code objects
inside functions) are already immutable and could be shared without
copying, and I guess copying the dicts, functions, and type objects
into the subinterpreter is much
cheaper than hitting the disk etc. to do a real import. (Though
certainly not free.)
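
As a very rough sketch of the gap involved (purely illustrative -- the
choice of json is arbitrary, and a shallow dict copy understates the
per-object fix-ups a real per-interpreter copy would need):

# Crude comparison: re-running the import machinery against a cold
# sys.modules cache, versus shallow-copying an already-populated
# module namespace.
import importlib
import json  # make sure it's loaded once before measuring
import sys
import timeit

def cold_import():
    # Forget json and its submodules so the import machinery does real
    # work again -- roughly what every fresh subinterpreter pays today.
    for name in [m for m in sys.modules
                 if m == "json" or m.startswith("json.")]:
        del sys.modules[name]
    importlib.import_module("json")

def copy_namespace():
    # Roughly what "copy state from the main interpreter" would do
    # instead (ignoring the per-object fix-ups it would really need).
    dict(sys.modules["json"].__dict__)

print("cold import:   ", timeit.timeit(cold_import, number=100))
print("copy namespace:", timeit.timeit(copy_namespace, number=100))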

This would have interesting semantic implications -- it would give
similar effects to fork(), with subinterpreters starting from a
snapshot of the main interpreter's global state.
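
Concretely, the analogy I have in mind is the behavior you already get
from fork() (Unix-only sketch; CONFIG is just a stand-in for some
module-level state):

import os

CONFIG = {"mode": "parent"}   # stand-in for module-level state

pid = os.fork()
if pid == 0:
    # The child starts from a snapshot of the parent's globals...
    print("child sees:", CONFIG["mode"])        # -> parent
    CONFIG["mode"] = "child"                    # ...and its changes stay local.
    os._exit(0)

os.waitpid(pid, 0)
print("parent still sees:", CONFIG["mode"])     # -> parent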

> I'm not entirely sold on this overall proposal, but I think a result of it
> could be to make our subinterpreter support better which would be a good
> thing.
>
> We have had to turn people away from subinterpreters in the past for use as
> part of their multithreaded C++ server where they wanted to occasionally run
> some Python code in embedded interpreters as part of serving some requests.
> Doing that would suddenly single-thread their application (GIIIIIIL!) for
> all requests currently executing Python code, despite the multiple
> subinterpreters.

I've also talked to HPC users who discovered this problem the hard way
(e.g. http://www-atlas.lbl.gov/, folks working on the Large Hadron
Collider) -- they've been using Python as an extension language in
some large physics codes but are now porting those bits to C++ because
of the GIL issues. (In this context startup overhead should be easily
amortized, but switching to an RPC model is not going to happen.)
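
For anyone who hasn't hit it directly: the effect is easy to demonstrate
with plain threads, and subinterpreters don't change it because there is
still exactly one GIL (rough sketch; spin() is just a stand-in for
CPU-bound per-request Python code):

import threading
import time

def spin(n=5 * 10**6):
    # Stand-in for CPU-bound per-request Python work.
    while n:
        n -= 1

def wall_time(num_threads):
    threads = [threading.Thread(target=spin) for _ in range(num_threads)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start

print("1 thread :", wall_time(1))
print("4 threads:", wall_time(4))   # ~4x the single-thread time, not ~1x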

-n

-- 
Nathaniel J. Smith -- http://vorpus.org

