[Python-ideas] The future of Python parallelism. The GIL. Subinterpreters. Actors.

Stephan Houben stephanh42 at gmail.com
Wed Jul 18 14:49:34 EDT 2018


Hi Eric, Antoine, all

Antoine said that what I proposed earlier was very similar to what Eric
is trying to do, but from the direction the discussion has taken so far
that appears not to be the case.

I will therefore try to clarify my proposal.

Basically, what I am suggesting is a direct translation of JavaScript's
Web Worker API
(https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API)
to Python.

The Web Worker API is generally considered a "share-nothing" approach,
although, as we will see, some state can be shared.

The basic principle is that any object lives in a single Worker (Worker =
subinterpreter).
If a message is sent from Worker A to Worker B, the message is not shared;
rather, the so-called "structured clone" algorithm is used to recursively
create a NEW message object in Worker B. This is roughly equivalent to
pickling in A and then unpickling in B.
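
To make the "pickle in A, unpickle in B" analogy concrete, here is a minimal
sketch using only the standard pickle module. The message contents are made
up for illustration, and pickle is only an approximation of structured
cloning, not the same algorithm.

```python
import pickle

# A nested message as Worker A might build it (illustrative data only).
message = {"job": 42, "payload": [1.5, 2.5, {"tag": "x"}]}

# "Send": serialise in A. "Receive": rebuild a brand-new object graph in B.
wire_bytes = pickle.dumps(message)
received = pickle.loads(wire_bytes)

# Equal in value, but distinct objects at every level of nesting.
assert received == message
assert received is not message
assert received["payload"] is not message["payload"]
assert received["payload"][2] is not message["payload"][2]

# Mutating B's copy leaves A's original untouched.
received["payload"].append("only in B")
assert len(message["payload"]) == 3
```

Since nothing in the received object graph is shared with the sender, neither
side needs to coordinate refcounts with the other.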

Of course, this may become a bottleneck if large amounts of data need to be
communicated.
Therefore, there is a special object type designed to provide a view upon a
piece of shared memory: SharedArrayBuffer. Notably, this only provides a
view upon raw "C"-style data (ints or floats or whatever), not on JavaScript
objects.
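
A rough Python analogue of "a view upon raw C-style data" is a memoryview
over a byte buffer. In the sketch below a plain bytearray stands in for the
shared memory block; that is an assumption for illustration, since sharing
between real subinterpreters would need support from the runtime or the OS.

```python
# A plain bytearray stands in for the shared memory block (assumption:
# real cross-interpreter sharing would need runtime or OS support).
buffer = bytearray(8 * 4)  # room for four C doubles

# Two independent views on the same bytes, as two workers might hold.
view_a = memoryview(buffer).cast("d")
view_b = memoryview(buffer).cast("d")

view_a[0] = 3.25          # written through one view...
assert view_b[0] == 3.25  # ...and visible through the other, with no copy

# Only raw C-style values fit; an arbitrary Python object is rejected.
try:
    view_a[1] = "not a double"
except TypeError:
    rejected = True
else:
    rejected = False
assert rejected
```

The views expose numbers only, so no Python object (and no Python refcount)
ever lives inside the shared bytes.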

To translate this to the Python situation: each Python object is owned by a
single subinterpreter, and may only be manipulated by a thread which holds
the GIL of that particular subinterpreter. Message sending between
subinterpreters will require the message objects to be "structured cloned".
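
The ownership rule can be sketched with threads standing in for
subinterpreters. The Worker class and its post_message and close methods are
hypothetical names chosen to echo the Web Worker API; they are not an
existing Python API, and pickle again stands in for structured cloning.

```python
import pickle
import queue
import threading


class Worker:
    """Hypothetical stand-in for a subinterpreter: a thread with an inbox."""

    def __init__(self, handler):
        self._inbox = queue.Queue()
        self._handler = handler
        self._thread = threading.Thread(target=self._run)
        self._thread.start()

    def post_message(self, obj):
        # Clone at the boundary: only bytes cross, never the object itself.
        self._inbox.put(pickle.dumps(obj))

    def close(self):
        self._inbox.put(None)  # sentinel: ask the loop to stop
        self._thread.join()

    def _run(self):
        while True:
            data = self._inbox.get()
            if data is None:
                break
            # The rebuilt object is owned by this worker alone.
            self._handler(pickle.loads(data))


seen = []  # demo only: lets the main thread inspect what the worker got


def handler(msg):
    msg["items"].append("changed in worker")
    seen.append(msg)


original = {"items": [1, 2, 3]}
worker = Worker(handler)
worker.post_message(original)
worker.close()

# The worker mutated its own clone; the sender's object is unchanged.
assert original == {"items": [1, 2, 3]}
assert seen[0]["items"] == [1, 2, 3, "changed in worker"]
```

With real subinterpreters the handler would run under a different GIL, which
is why the clone, rather than the original, has to be what arrives.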

Certain C extension types may override what structured cloning means for
them.
In particular, some C extension types may have a two-layer structure where
the PyObject contains a refcounted pointer to the actual data.
Structured cloning of such an object may create a second PyObject which
references the same underlying object.
This secondary refcount will need to be properly atomic, since it may be
manipulated from multiple subinterpreters.
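
As a pure-Python sketch of that two-layer structure: SharedPayload, Handle
and structured_clone below are hypothetical names, and a threading.Lock
stands in for what would be an atomic integer in a real C extension.

```python
import threading


class SharedPayload:
    """Inner layer: the actual data plus its own thread-safe refcount."""

    def __init__(self, data):
        self.data = data
        self._refs = 0
        self._lock = threading.Lock()  # stand-in for an atomic integer

    def incref(self):
        with self._lock:
            self._refs += 1

    def decref(self):
        with self._lock:
            self._refs -= 1
            if self._refs == 0:
                self.data = None  # last handle gone: free the data

    @property
    def refs(self):
        with self._lock:
            return self._refs


class Handle:
    """Outer layer: an ordinary per-interpreter object wrapping the payload."""

    def __init__(self, payload):
        self.payload = payload
        payload.incref()

    def structured_clone(self):
        # Overridden cloning: new outer object, same inner payload.
        return Handle(self.payload)

    def release(self):
        self.payload.decref()
        self.payload = None


payload = SharedPayload(bytearray(b"raw data"))
handle_a = Handle(payload)
handle_b = handle_a.structured_clone()
assert handle_a is not handle_b
assert handle_a.payload is handle_b.payload


def churn():
    # Many concurrent clone/release pairs must leave the count unchanged.
    for _ in range(1000):
        handle_a.structured_clone().release()


threads = [threading.Thread(target=churn) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
assert payload.refs == 2

handle_a.release()
handle_b.release()
assert payload.refs == 0 and payload.data is None
```

Only the inner count pays for synchronisation; each Handle remains an
ordinary object owned by a single interpreter.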

In this way, interpreter-shared data structures can be implemented.
However, all the "normal" Python objects are not shared and can continue
to use the current, non-atomic refcounting implementation.

Hope this clarifies my proposal.

Stephan


2018-07-18 19:58 GMT+02:00 Eric Snow <ericsnowcurrently at gmail.com>:

> On Wed, Jul 18, 2018 at 1:37 AM Barry Scott <barry at barrys-emacs.org>
> wrote:
> > Let me try a longer answer. The inc+test and dec+test do not require a
> > lock if coded correctly. All OS and run times have solved this to provide
> > locks. All processors provide the instructions that are the building
> > blocks for lock primitives.
> >
> > You cannot mutate a mutable python object that is not protected with the
> > GIL as the change of state involves multiple parts of the object changing.
> >
> > If you know that an object is immutable then you could only do a check
> > on the ref count as you will never change the state of the object beyond
> > its ref count.
> > To access the object you only have to ensure it will not be deleted,
> > which the ref count guarantees. The delete of the immutable object is
> > then the only job that the original interpreter must do.
>
> Perhaps we're agreeing?  Other than the single decref when
> "releasing" the object, it won't ever be directly modified (even the
> refcount) in the other interpreter.  In effect that interpreter holds
> a reference to the object which prevents GC in the "owning"
> interpreter (the corresponding incref happened in that original
> interpreter before the object was "shared").  The only issue is how to
> "release" the object in the other interpreter so that the decref
> happens in the "owning" interpreter.  As earlier noted, I'm planning
> on taking advantage of the existing ceval "pending calls" machinery.
>
> So I'm not sure where an atomic int would factor in.  If you mean
> switching the existing refcount to an atomic int for the sake of the
> cross-interpreter decref then that's not going to happen, as Ronald
> suggested.  Larry could tell you about his Gilectomy experience. :)
>
> Are you suggesting something like a second "cross-interpreter
> refcount", which would be atomic, and add a check in Py_DECREF?  That
> would imply an extra cross-interpreter-oriented C-API to parallel
> Py_DECREF.  It would also mean either adding another field to PyObject
> (yikes!) or keeping a separate table for tracking cross-interpreter
> references.  I'm not sure any of that would be better than the
> alternative I'm pursuing.  Then again, I've considered tracking which
> interpreters hold a "reference" to an object, which isn't that
> different.
>
> -eric