[Python-Dev] PEP 554 v3 (new interpreters module)

Mon Oct 2 21:31:30 EDT 2017

On Thu, Sep 14, 2017 at 8:44 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> Not really, because the only way to ensure object separation (i.e. no
> refcounted objects accessible from multiple interpreters at once) with
> a bytes-based API would be to either:
>
> 1. Always copy (eliminating most of the low overhead communications
> benefits that subinterpreters may offer over multiple processes)
> 2. Make the bytes implementation more complicated by allowing multiple
> bytes objects to share the same underlying storage while presenting as
> distinct objects in different interpreters
> 3. Make the output on the receiving side not actually a bytes object,
> but instead a view onto memory owned by another object in a different
> interpreter (a "memory view", one might say)

4. Pass Bytes through directly.

The only problem of which I'm aware is that when Py_DECREF() drops the
last reference and triggers deallocation of the Bytes object, that
happens in the current interpreter, which may not be the "owner" (i.e.
the interpreter that allocated the object).  So the solution would be
to make PyBytes_Type.tp_free() effectively run as a "pending call"
under the owner.  This would require two things:

1. a new PyBytesObject.owner field (PyInterpreterState *), or a
separate owner table, which would be set when the object is passed
through a channel
2. a Py_AddPendingCall() that targets a specific interpreter (which I
expect would be desirable regardless)

Then, when the object has an owner, PyBytes_Type.tp_free() would add a
pending call on the owner to call PyObject_Del() on the Bytes object.

The catch is that currently "pending" calls (via Py_AddPendingCall)
are run only in the main thread of the main interpreter.  We'd need a
similar mechanism that targets a specific interpreter.
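
To make the flow concrete, here is a rough single-threaded simulation in
Python.  It is only a sketch: Interpreter, SharedBytes, send(), and
decref() are hypothetical stand-ins for PyInterpreterState, a
PyBytesObject with an owner field, a channel send, and Py_DECREF().
None of them are existing APIs, and the real thing would live in C.

```python
import collections
import threading


class Interpreter:
    """Hypothetical stand-in for PyInterpreterState with its own pending-call queue."""

    def __init__(self, name):
        self.name = name
        self._pending = collections.deque()
        self._lock = threading.Lock()
        self.freed = []  # payloads this interpreter has actually released

    def add_pending_call(self, func, arg):
        # Assumed analogue of a Py_AddPendingCall() that targets one interpreter.
        with self._lock:
            self._pending.append((func, arg))

    def run_pending_calls(self):
        # Would run at a safe point in the owning interpreter's eval loop.
        while True:
            with self._lock:
                if not self._pending:
                    return
                func, arg = self._pending.popleft()
            func(arg)


class SharedBytes:
    """Hypothetical bytes-like object carrying an owner field."""

    def __init__(self, data):
        self.data = data
        self.owner = None  # set only when the object goes through a channel
        self.refcount = 1


def send(obj, sender):
    """Simulate passing obj through a channel: record the owner, add a reference."""
    obj.owner = sender
    obj.refcount += 1
    return obj


def decref(obj, current):
    """Simulate Py_DECREF() executed while `current` is the running interpreter."""
    obj.refcount -= 1
    if obj.refcount != 0:
        return
    owner = obj.owner
    if owner is None or owner is current:
        current.freed.append(obj.data)  # ordinary path: free immediately
    else:
        # Foreign interpreter: defer the actual free to the owner.
        owner.add_pending_call(lambda o: owner.freed.append(o.data), obj)


main = Interpreter("main")
sub = Interpreter("sub")
buf = send(SharedBytes(b"spam"), main)

decref(buf, main)  # main drops its reference; one reference remains
decref(buf, sub)   # sub drops the last one; the free is queued on main
freed_before_drain = list(main.freed)
main.run_pending_calls()  # main reaches a safe point and frees the buffer
```

The point is just that the last DECREF in "sub" frees nothing itself;
the memory is released only once "main" drains its own queue.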

> By contrast, if we allow an actual bytes object to be shared, then
> either every INCREF or DECREF on that bytes object becomes a
> synchronisation point, or else we end up needing some kind of
> secondary per-interpreter refcount where the interpreter doesn't drop
> its shared reference to the original object in its source interpreter
> until the internal refcount in the borrowing interpreter drops to
> zero.

There shouldn't be a need to synchronize on INCREF.  If both
interpreters have at least 1 reference then either one adding a
reference shouldn't be a problem.  If only one interpreter has a
reference then the other won't be adding any references.  If neither
has a reference then neither is going to add any references.  Perhaps
I've missed something.  Under what circumstances would INCREF happen
while the refcount is 0?

On DECREF there shouldn't be a problem except possibly with a small
race between decrementing the refcount and checking for a refcount of
0.  We could address that in several different ways, including allowing
the pending call to be queued only once (or making it a no-op the
second time).
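
One way the "queue only once" option could look, again as a Python
simulation rather than real C code.  FreeRequest and try_queue() are
hypothetical names; the assumption is a per-object flag that is tested
and set atomically (a lock here, possibly an atomic compare-and-swap in
C), so a second request to free the same object does nothing.

```python
import threading


class FreeRequest:
    """Hypothetical per-object guard that lets the free request be queued once."""

    def __init__(self):
        self._lock = threading.Lock()
        self._queued = False

    def try_queue(self, owner_queue, obj_id):
        # Test-and-set under a lock: only the first caller gets to enqueue.
        with self._lock:
            if self._queued:
                return False
            self._queued = True
        owner_queue.append(obj_id)
        return True


owner_queue = []          # stands in for the owner's pending-call queue
request = FreeRequest()
barrier = threading.Barrier(2)
results = []


def racer():
    # Both threads believe they saw the refcount hit zero.
    barrier.wait()
    results.append(request.try_queue(owner_queue, "buf-1"))


threads = [threading.Thread(target=racer) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Whichever thread wins, the owner sees exactly one free request for the
object, so a double PyObject_Del() can't happen from this race.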

FWIW, I'm not opposed to the CIV/memoryview approach, but want to make
sure we really can't use Bytes before going down that route.

-eric