Hi Eric, Antoine, all

Antoine said that what I proposed earlier was very similar to what Eric
is trying to do, but judging from the direction the discussion has taken
so far, that appears not to be the case.

I will therefore try to clarify my proposal.

Basically, what I am suggesting is a direct translation of JavaScript's
Web Worker API (https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API)
to Python.

The Web Worker API is generally considered a "share-nothing" approach, although
as we will see some state can be shared.

The basic principle is that any object lives in a single Worker (Worker = subinterpreter).
If a message is sent from Worker A to Worker B, the message is not shared;
instead, the so-called "structured clone" algorithm is used to recursively create a NEW
message object in Worker B. This is roughly equivalent to pickling in A and then
unpickling in B.
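
To make the pickle analogy concrete, here is a rough C sketch of what one
such clone step might look like. This is an illustration only: the way the
receiving interpreter's thread state is obtained, and the channel plumbing
around it, are my assumptions, not an existing API.

    /* Illustration only: "structured clone" a message by pickling it under
       the sending interpreter's GIL and unpickling it under the receiving
       interpreter's GIL, producing a completely NEW object in B. */
    #define PY_SSIZE_T_CLEAN
    #include <Python.h>
    #include <string.h>

    static PyObject *
    structured_clone(PyObject *msg, PyThreadState *target)
    {
        char *buf;
        Py_ssize_t len;

        /* 1. Pickle in interpreter A (its GIL is held). */
        PyObject *pickle = PyImport_ImportModule("pickle");
        PyObject *blob = pickle ? PyObject_CallMethod(pickle, "dumps", "O", msg) : NULL;
        Py_XDECREF(pickle);
        if (blob == NULL || PyBytes_AsStringAndSize(blob, &buf, &len) < 0) {
            Py_XDECREF(blob);
            return NULL;
        }

        /* Copy the raw bytes so that nothing owned by A crosses the boundary. */
        char *raw = PyMem_Malloc(len);
        if (raw == NULL) {
            Py_DECREF(blob);
            return PyErr_NoMemory();
        }
        memcpy(raw, buf, len);
        Py_DECREF(blob);

        /* 2. Switch to interpreter B and unpickle: a brand-new object graph. */
        PyThreadState *save = PyThreadState_Swap(target);
        PyObject *mod = PyImport_ImportModule("pickle");
        PyObject *copy = mod ? PyObject_CallMethod(mod, "loads", "y#", raw, len) : NULL;
        Py_XDECREF(mod);
        PyThreadState_Swap(save);

        PyMem_Free(raw);
        return copy;  /* owned by, and only usable from, interpreter B */
    }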

Of course, this may become a bottleneck if large amounts of data need to be communicated.
Therefore, there is a special object type designed to provide a view upon a piece
of shared memory: SharedArrayBuffer. Notably, this only provides a view upon
raw "C"-style data (ints or floats or whatever), not on JavaScript objects.

To translate this to the Python situation: each Python object is owned by a single
subinterpreter, and may only be manipulated by a thread which holds the GIL
of that particular subinterpreter. Message sending between subinterpreters will
require the message objects to be "structured cloned".
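
Sketching the rule from the sending side, reusing the structured_clone
sketch above (every other name here, Channel, channel_push,
channel_dest_tstate, is invented for illustration; only the cloning
requirement is the point):

    /* Hypothetical plumbing, purely for illustration: a "channel" that
       always clones on send, so the receiver never sees the sender's object. */
    typedef struct Channel Channel;                    /* lives in interpreter B */

    int channel_push(Channel *ch, PyObject *obj);      /* assumed helper */
    PyThreadState *channel_dest_tstate(Channel *ch);   /* assumed helper */

    static int
    channel_send(Channel *ch, PyObject *msg)
    {
        /* Interpreter B only ever receives a structured clone of msg. */
        PyObject *copy = structured_clone(msg, channel_dest_tstate(ch));
        if (copy == NULL)
            return -1;
        /* Ownership of the clone passes to interpreter B; from here on,
           only threads holding B's GIL may touch it. */
        return channel_push(ch, copy);
    }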

Certain C extension types may override what structured cloning means for them.
In particular, some C extension types may have a two-layer structure in which
the PyObject contains a refcounted pointer to the actual data.
Structured cloning of such an object may create a second PyObject which
references the same underlying data.
This secondary refcount will need to be properly atomic, since it may be manipulated
from multiple subinterpreters.
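
As a rough illustration (the names and layout are mine, just to show the
shape of the idea), such a two-layer type could look like this:

    /* Sketch of the two-layer layout: each interpreter gets its own
       PyObject wrapper, and all wrappers point at one block of shared,
       raw "C"-style data that carries its own atomic refcount. */
    #include <Python.h>
    #include <stdatomic.h>
    #include <stdlib.h>

    typedef struct {
        atomic_size_t refcount;   /* may be touched from several subinterpreters */
        Py_ssize_t    length;
        double        data[];     /* raw payload, no Python objects inside */
    } SharedData;

    typedef struct {
        PyObject_HEAD             /* ordinary, per-interpreter, non-atomic refcount */
        SharedData *shared;       /* the secondary, atomic refcount lives in here */
    } SharedViewObject;

    static void
    shared_data_incref(SharedData *sd)
    {
        atomic_fetch_add_explicit(&sd->refcount, 1, memory_order_relaxed);
    }

    static void
    shared_data_decref(SharedData *sd)
    {
        /* Last wrapper out frees the shared block. */
        if (atomic_fetch_sub_explicit(&sd->refcount, 1, memory_order_acq_rel) == 1)
            free(sd);
    }

Structured cloning such an object would then amount to creating a new
SharedViewObject in the receiving interpreter and calling
shared_data_incref(), rather than copying the payload.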

In this way, interpreter-shared data structures can be implemented.
However, "normal" Python objects are never shared and can continue
to use the current, non-atomic refcounting implementation.
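
Continuing the sketch above, the wrapper's dealloc would be the only place
where the atomic count is touched; Py_INCREF/Py_DECREF on the wrapper
itself stay exactly as they are today:

    /* Only when a per-interpreter wrapper dies is the shared, atomic
       refcount dropped; everything else is normal non-atomic refcounting. */
    static void
    sharedview_dealloc(SharedViewObject *self)
    {
        shared_data_decref(self->shared);       /* atomic; may free the payload */
        Py_TYPE(self)->tp_free((PyObject *)self);
    }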

Hope this clarifies my proposal.

Stephan


2018-07-18 19:58 GMT+02:00 Eric Snow <ericsnowcurrently@gmail.com>:
On Wed, Jul 18, 2018 at 1:37 AM Barry Scott <barry@barrys-emacs.org> wrote:
> Let me try a longer answer. The inc+test and dec+test do not require a
> lock if coded correctly. All OS and run times have solved this to provide
> locks. All processors provide the instructions that are the building blocks
> for lock primitives.
>
> You cannot mutate a mutable python object that is not protected with the GIL as
> the change of state involves multiple parts of the object changing.
>
> If you know that an object is immutable then you could only do a check on the
> ref count as you will never change the state of the object beyond its ref count.
> To access the object you only have to ensure it will not be deleted, which the
> ref count guarantees. The delete of the immutable object is then the only job
> that the original interpreter must do.

Perhaps we're agreeing?  Other than the single decref when
"releasing" the object, it won't ever be directly modified (even the
refcount) in the other interpreter.  In effect that interpreter holds
a reference to the object which prevents GC in the "owning"
interpreter (the corresponding incref happened in that original
interpreter before the object was "shared").  The only issue is how to
"release" the object in the other interpreter so that the decref
happens in the "owning" interpreter.  As earlier noted, I'm planning
on taking advantage of the existing ceval "pending calls" machinery.

So I'm not sure where an atomic int would factor in.  If you mean
switching the existing refcount to an atomic int for the sake of the
cross-interpreter decref then that's not going to happen, as Ronald
suggested.  Larry could tell you about his Gilectomy experience. :)

Are you suggesting something like a second "cross-interpreter
refcount", which would be atomic, and add a check in Py_DECREF?  That
would imply an extra cross-interpreter-oriented C-API to parallel
Py_DECREF.  It would also mean either adding another field to PyObject
(yikes!) or keeping a separate table for tracking cross-interpreter
references.  I'm not sure any of that would be better than the
alternative I'm pursuing.  Then again, I've considered tracking which
interpreters hold a "reference" to an object, which isn't that
different.

-eric