Hi Eric, Antoine, all
Antoine said that what I proposed earlier was very similar to what Eric
is trying to do, but judging from the direction the discussion has taken
so far, that does not appear to be the case.
I will therefore try to clarify my proposal.
Basically, what I am suggesting is a direct translation of Javascript's
Web Worker API
(https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API)
to Python.
The Web Worker API is generally considered a "share-nothing" approach,
although, as we will see, some state can be shared.
The basic principle is that any object lives in a single Worker (Worker =
subinterpreter).
If a message is sent from Worker A to Worker B, the message itself is not
shared; rather, the so-called "structured clone" algorithm is used to
recursively create a NEW message object in Worker B. This is roughly
equivalent to pickling in A and then unpickling in B.
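To make the pickle analogy concrete, here is a minimal sketch (the helper name is my own, purely illustrative) of what the structured-clone step amounts to in today's Python:

```python
import pickle

def structured_clone(obj):
    """Simulate the structured-clone step: the receiver gets a deep,
    independent copy, never a reference into the sender's heap."""
    return pickle.loads(pickle.dumps(obj))

msg = {"op": "update", "values": [1, 2, 3]}
clone = structured_clone(msg)
assert clone == msg                          # equal content...
assert clone is not msg                      # ...but a distinct object
assert clone["values"] is not msg["values"]  # cloned recursively
```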
Of course, this may become a bottleneck if large amounts of data need to be
communicated.
Therefore, there is a special object type designed to provide a view on a
piece of shared memory: SharedArrayBuffer. Notably, this only provides a
view on raw "C"-style data (ints, floats, and so on), not on Javascript
objects.
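As a rough illustration of the idea on the Python side (using the stdlib multiprocessing.shared_memory module, which works between processes rather than subinterpreters, as a stand-in for a SharedArrayBuffer):

```python
from multiprocessing import shared_memory

# A SharedArrayBuffer-like region: raw bytes, here viewed as C doubles.
shm = shared_memory.SharedMemory(create=True, size=8 * 4)
view = shm.buf.cast("d")   # view the 32 bytes as 4 C doubles
view[0] = 3.14

# A second attachment (by name) sees the same raw data, with no copy:
other = shared_memory.SharedMemory(name=shm.name)
value_seen = other.buf.cast("d")[0]

view.release()
other.close()
shm.close()
shm.unlink()
assert value_seen == 3.14
```

Note that, as with SharedArrayBuffer, only raw C-style values are visible through the view, never Python objects.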
To translate this to the Python situation: each Python object is owned by a
single subinterpreter, and may only be manipulated by a thread that holds
the GIL of that particular subinterpreter. Sending a message between
subinterpreters requires the message object to be "structured cloned".
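A hypothetical Python-level API along these lines, sketched with ordinary threads standing in for subinterpreters (the names Worker and post_message are borrowed from the Web Worker API for illustration, not a proposed spec):

```python
import pickle
import queue
import threading

class Worker:
    """Hypothetical share-nothing worker; a thread stands in for a
    subinterpreter in this sketch."""
    def __init__(self, target):
        self._inbox = queue.Queue()
        self._thread = threading.Thread(target=target, args=(self._inbox,))
        self._thread.start()

    def post_message(self, obj):
        # Structured clone: the receiver never sees the sender's object,
        # only a serialized copy to be reconstructed on its side.
        self._inbox.put(pickle.dumps(obj))

    def join(self):
        self._inbox.put(None)  # sentinel: no more messages
        self._thread.join()

results = []

def handler(inbox):
    while (data := inbox.get()) is not None:
        results.append(pickle.loads(data))  # the "unpickle in B" step

w = Worker(handler)
w.post_message({"n": 42})
w.join()
assert results == [{"n": 42}]
```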
Certain C extension types may override what structured cloning means for
them.
In particular, some C extension types may have a two-layer structure where
the PyObject contains a refcounted pointer to the actual data.
Structured cloning such an object may create a second PyObject which
references the same underlying object.
This secondary refcount will need to be properly atomic, since it may be
manipulated from multiple subinterpreters.
In this way, interpreter-shared data structures can be implemented.
However, all the "normal" Python objects are not shared and can continue
to use the current, non-atomic refcounting implementation.
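A pure-Python sketch of that two-layer structure (all class names are illustrative, and a lock stands in for the atomic increment/decrement that the real C implementation would need):

```python
import threading

class _SharedData:
    """Layer 2: the actual data, with its own cross-interpreter refcount.
    The lock models the atomic inc/dec the proposal requires."""
    def __init__(self, payload):
        self.payload = payload
        self._count = 0
        self._lock = threading.Lock()

    def incref(self):
        with self._lock:
            self._count += 1

    def decref(self):
        with self._lock:
            self._count -= 1
            if self._count == 0:
                self.payload = None  # last reference gone: free the data

class SharedHandle:
    """Layer 1: the per-interpreter PyObject-like wrapper; within its own
    interpreter it keeps using normal, non-atomic refcounting."""
    def __init__(self, shared):
        self._shared = shared
        shared.incref()

    def structured_clone(self):
        # "Cloning" creates a second wrapper over the same underlying data.
        return SharedHandle(self._shared)

data = _SharedData(b"large payload")
a = SharedHandle(data)            # handle in the owning interpreter
b = a.structured_clone()          # handle in a second interpreter
assert b._shared is a._shared     # same underlying data, not a copy
assert data._count == 2           # the secondary, atomic refcount
```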
Hope this clarifies my proposal.
Stephan
2018-07-18 19:58 GMT+02:00 Eric Snow:

On Wed, Jul 18, 2018 at 1:37 AM Barry Scott wrote:

Let me try a longer answer. The inc+test and dec+test do not require a lock if coded correctly. All OSes and runtimes have solved this in order to provide locks. All processors provide the instructions that are the building blocks for lock primitives.
You cannot mutate a mutable Python object that is not protected by the GIL, as the change of state involves multiple parts of the object changing.
If you know that an object is immutable, then you only need to check the ref count, as you will never change the state of the object beyond its ref count. To access the object you only have to ensure it will not be deleted, which the ref count guarantees. Deleting the immutable object is then the only job that the original interpreter must do.
Perhaps we're agreeing? Other than the single decref when "releasing" the object, it won't ever be directly modified (even the refcount) in the other interpreter. In effect that interpreter holds a reference to the object which prevents GC in the "owning" interpreter (the corresponding incref happened in that original interpreter before the object was "shared"). The only issue is how to "release" the object in the other interpreter so that the decref happens in the "owning" interpreter. As earlier noted, I'm planning on taking advantage of the existing ceval "pending calls" machinery.
So I'm not sure where an atomic int would factor in. If you mean switching the existing refcount to an atomic int for the sake of the cross-interpreter decref then that's not going to happen, as Ronald suggested. Larry could tell you about his Gilectomy experience. :)
Are you suggesting something like a second "cross-interpreter refcount", which would be atomic, and add a check in Py_DECREF? That would imply an extra cross-interpreter-oriented C-API to parallel Py_DECREF. It would also mean either adding another field to PyObject (yikes!) or keeping a separate table for tracking cross-interpreter references. I'm not sure any of that would be better than the alternative I'm pursuing. Then again, I've considered tracking which interpreters hold a "reference" to an object, which isn't that different.
-eric

_______________________________________________
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/