Hi Nick,
I guess I'll have to scale back my hopes on that front to be closer to what Stephan described - even a deep copy equivalent is often going to be cheaper than a full serialise/transmit/deserialise cycle or some other form of inter-process communication.
I would like to add that in many cases the underlying C objects *could* be shared. I identified some possible use cases of this:

1. numpy/scipy: share the underlying memory of an ndarray. Threads can then operate on the same array without GIL interference.

2. SQLite in-memory database: multiple threads can operate on it in parallel. If you have an ORM on top, it might feel very similar to just sharing Python objects across threads.

3. Tree of XML elements (like xml.etree): assuming the tree data structure itself is in C, the tree could be shared across interpreters. This would be an example of a "deep" data structure which can still be shared efficiently.

So I feel this could still be very useful even if pure-Python objects need to be copied. (A rough sketch of what case 1 could look like with existing tools is appended after the quoted message below.)

Thanks,
Stephan

2017-05-27 9:32 GMT+02:00 Nick Coghlan <ncoghlan@gmail.com>:
On 27 May 2017 at 03:30, Guido van Rossum <guido@python.org> wrote:
On Fri, May 26, 2017 at 8:28 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
[...] assuming the rest of the idea works out well, we'd eventually like to move to a tiered model where the GIL becomes a read/write lock. Most code execution in subinterpreters would then only need a read lock on the GIL, and hence could happily execute code in parallel with other subinterpreters running on other cores.
Since the GIL protects refcounts and refcounts are probably the most frequently written item, I'm skeptical of this.
Likewise - hence my somewhat garbled attempt to explain that actually doing that would be contingent on the GILectomy folks figuring out some clever way to cope with the refcounts :)
By contrast, being able to reliably model Communicating Sequential Processes in Python without incurring any communication overhead (a la goroutines)? Or doing the same with the Actor model (a la Erlang/BEAM processes)?
Those are *very* interesting language design concepts, and something where offering a compelling alternative to the current practices of emulating them with threads or coroutines pretty much requires the property of zero-copy ownership transfer.
But subinterpreters (which have independent sys.modules dicts) seem a poor match for that. It feels as if you're speculating about an entirely different language here, not named Python.
Ah, you're right - the types are all going to be separate as well, which means "cost of a deep copy" is the cheapest we're going to be able to get with this model. Anything better than that would require a more esoteric memory management architecture like the one in PyParallel.
I guess I'll have to scale back my hopes on that front to be closer to what Stephan described - even a deep copy equivalent is often going to be cheaper than a full serialise/transmit/deserialise cycle or some other form of inter-process communication.
Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
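
P.S. To make case 1 above a bit more concrete, here is a rough sketch using what already exists today: ordinary processes and multiprocessing.RawArray standing in for subinterpreters and whatever sharing mechanism they might eventually grow (the worker function, sizes and names are just for illustration). Only a tiny description of the buffer is handed to each worker; the ndarray data itself is never copied:

import multiprocessing as mp
import numpy as np

N = 1_000_000

def worker(raw, start, stop):
    # Wrap the shared buffer in an ndarray view; no data is copied.
    arr = np.frombuffer(raw, dtype=np.float64)
    arr[start:stop] = np.arange(start, stop)   # writes land directly in the shared buffer

if __name__ == "__main__":
    raw = mp.RawArray('d', N)                  # one buffer, allocated in shared memory
    arr = np.frombuffer(raw, dtype=np.float64)
    arr[:] = 0.0

    # Four workers each fill a disjoint quarter of the *same* buffer.
    step = N // 4
    procs = [mp.Process(target=worker, args=(raw, i * step, (i + 1) * step))
             for i in range(4)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()

    print(np.array_equal(arr, np.arange(N)))   # True -- all writes are visible here

The interesting part for subinterpreters would be getting the same zero-copy view of the buffer without ever leaving the process.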