[Python-Dev] PEP 554 v3 (new interpreters module)

Mon Sep 18 06:46:36 EDT 2017

Hi,

First, my high-level opinion about the PEP: the CSP model can probably
already be implemented using Queues.  To me, the interesting promise of
subinterpreters is that they might allow removing the GIL while still
sharing memory for big objects (such as Numpy arrays).  This means the
PEP should probably focus on potential concurrency improvements rather
than try to faithfully follow the CSP model.
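
To illustrate the first point, here is a minimal sketch of CSP-style
message passing using only the stdlib queue and threading modules.  The
worker function and the None sentinel are illustrative choices of mine,
not something the PEP specifies.

```python
import queue
import threading

def worker(inbox, outbox):
    """Receive numbers until a None sentinel arrives; send back squares."""
    while True:
        item = inbox.get()
        if item is None:  # illustrative end-of-stream marker
            break
        outbox.put(item * item)

inbox, outbox = queue.Queue(), queue.Queue()
thread = threading.Thread(target=worker, args=(inbox, outbox))
thread.start()

for n in (1, 2, 3):
    inbox.put(n)
inbox.put(None)
thread.join()

squares = [outbox.get() for _ in range(3)]
```

The two sides share nothing except the queues, which is the essence of
the model; what this cannot give you is freedom from the GIL.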

Other than that, a bunch of detailed comments follow:

On Wed, 13 Sep 2017 18:44:31 -0700
Eric Snow <ericsnowcurrently at gmail.com> wrote:
> 
> API for interpreters
> --------------------
> 
> The module provides the following functions:
> 
> ``list_all()``::
> 
>    Return a list of all existing interpreters.

See my naming proposal in the previous thread.

> 
>    run(source_str, /, **shared):
> 
>       Run the provided Python source code in the interpreter.  Any
>       keyword arguments are added to the interpreter's execution
>       namespace.

"Execution namespace" specifically means the __main__ module in the
target interpreter, right?

>  If any of the values are not supported for sharing
>       between interpreters then RuntimeError gets raised.  Currently
>       only channels (see "create_channel()" below) are supported.
> 
>       This may not be called on an already running interpreter.  Doing
>       so results in a RuntimeError.

I would distinguish between both error cases: RuntimeError for calling
run() on an already running interpreter, ValueError for values which
are not supported for sharing.

>       Likewise, if there is any uncaught
>       exception, it propagates into the code where "run()" was called.

That makes it a bit harder to distinguish such exceptions from errors
raised by run() itself (see above), though how much of an annoyance
this is remains unclear.  The more contentious implication, though, is
that it forces the interpreter to support migration of arbitrary
objects from one interpreter to another (since a traceback keeps all
local variables alive).

> API for sharing data
> --------------------
> 
> The mechanism for passing objects between interpreters is through
> channels.  A channel is a simplex FIFO similar to a pipe.  The main
> difference is that channels can be associated with zero or more
> interpreters on either end.

So it seems channels have become more complicated now?  Is it important
to support multi-producer multi-consumer channels?

>  Unlike queues, which are also many-to-many,
> channels have no buffer.

How does it work?  Does send() block until someone else calls recv()?
That does not sound like a good idea to me.  I don't think it's a
coincidence that the most varied kinds of I/O (from socket or file I/O
to threading Queues to multiprocessing Pipes) have a buffered send()
that does not wait for the receiver.

send() blocking until someone else calls recv() is not only bad for
performance, it also increases the likelihood of deadlocks.
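
A rough sketch of the deadlock risk, using threads as stand-ins for
interpreters.  RendezvousChannel and BufferedChannel are toy classes I
made up for the illustration (they are not the PEP's API), and the
timeout exists only so that the demonstration terminates instead of
hanging.

```python
import queue
import threading

class RendezvousChannel:
    """Toy unbuffered channel: send() waits until recv() took the item."""
    def __init__(self):
        self._items = queue.Queue()
        self._acks = queue.Queue()

    def send(self, obj, timeout=None):
        self._items.put(obj)
        self._acks.get(timeout=timeout)  # queue.Empty if nobody receives

    def recv(self, timeout=None):
        obj = self._items.get(timeout=timeout)
        self._acks.put(None)  # release the waiting sender
        return obj

class BufferedChannel:
    """Toy buffered channel: send() returns as soon as the item is queued."""
    def __init__(self):
        self._items = queue.Queue()

    def send(self, obj, timeout=None):
        self._items.put(obj)

    def recv(self, timeout=None):
        return self._items.get(timeout=timeout)

def peer(name, outbox, inbox, results):
    # Each peer sends first and only then receives.
    try:
        outbox.send(name, timeout=0.2)
        results[name] = inbox.recv(timeout=0.2)
    except queue.Empty:
        results[name] = "stuck"

def exchange(channel_type):
    a_to_b, b_to_a = channel_type(), channel_type()
    results = {}
    threads = [
        threading.Thread(target=peer, args=("A", a_to_b, b_to_a, results)),
        threading.Thread(target=peer, args=("B", b_to_a, a_to_b, results)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

unbuffered_result = exchange(RendezvousChannel)
buffered_result = exchange(BufferedChannel)
```

With the rendezvous channel both peers sit in send() waiting for a
recv() that never comes; with the buffered one the same code completes.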

>    recv_nowait(default=None):
> 
>       Return the next object from the channel.  If none have been sent
>       then return the default.  If the channel has been closed
>       then EOFError is raised.
> 
>    close():
> 
>       No longer associate the current interpreter with the channel (on
>       the receiving end).  This is a noop if the interpreter isn't
>       already associated.  Once an interpreter is no longer associated
>       with the channel, subsequent (or current) send() and recv() calls
>       from that interpreter will raise EOFError.

EOFError normally means the *other* (sending) side has closed the
channel (but it becomes complicated with a multi-producer multi-consumer
setup...). When *this* side has closed the channel, we should raise
ValueError.
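
For comparison, this is how two existing stdlib objects behave; it is
only meant to show the convention I am referring to, not what channels
must do.  A closed io.BytesIO stands in for "this side closed", and a
one-way multiprocessing Pipe for "the other side closed".

```python
import io
from multiprocessing import Pipe

own_side_error = None
other_side_error = None

# Case 1: *this* side closed the object and then tried to use it.
stream = io.BytesIO(b"data")
stream.close()
try:
    stream.read()
except ValueError as exc:
    own_side_error = type(exc).__name__

# Case 2: the *other* (sending) side closed; the reader reaches the end.
reader, writer = Pipe(duplex=False)
writer.send("last message")
writer.close()
first = reader.recv()  # data sent before the close is still delivered
try:
    reader.recv()
except EOFError as exc:
    other_side_error = type(exc).__name__
reader.close()
```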

>  The Python runtime
>       will garbage collect all closed channels.  Note that "close()" is
>       automatically called when it is no longer used in the current
>       interpreter.

"No longer used" meaning it loses all references in this interpreter?

>    send(obj):
> 
>        Send the object to the receiving end of the channel.  Wait until
>        the object is received.  If the channel does not support the
>        object then TypeError is raised.  Currently only bytes are
>        supported.  If the channel has been closed then EOFError is
>        raised.

Similar remark as above (EOFError vs. ValueError).
More generally, send() raising EOFError sounds unheard of.

A side note: context manager support (__enter__ / __exit__) on channels
sounds more useful to me than iteration support.
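
Something along these lines, as a sketch.  ToyChannel is a hypothetical
in-process stand-in backed by a queue.Queue; the only point is that
leaving the "with" block closes this side of the channel, and that
later calls raise ValueError as suggested above.

```python
import queue

class ToyChannel:
    """Hypothetical channel stand-in; not the PEP's implementation."""
    def __init__(self):
        self._items = queue.Queue()
        self.closed = False

    def send(self, obj):
        if self.closed:
            raise ValueError("send() on a channel closed by this side")
        self._items.put(obj)

    def recv(self):
        if self.closed:
            raise ValueError("recv() on a channel closed by this side")
        return self._items.get()

    def close(self):
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False  # never swallow exceptions from the block

with ToyChannel() as channel:
    channel.send(b"ping")
    reply = channel.recv()

try:
    channel.send(b"late")
except ValueError:
    rejected_after_close = True
else:
    rejected_after_close = False
```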

> Initial support for buffers in channels
> ---------------------------------------
> 
> An alternative to support for bytes in channels is support for
> read-only buffers (the PEP 3119 kind).

Probably you mean PEP 3118.

> Then ``recv()`` would return
> a memoryview to expose the buffer in a zero-copy way.

It will probably not do much if you can only pass buffers and not
structured objects, because deserializing (e.g. unpickling) from a
buffer will still copy memory around.

To pass a Numpy array, for example, you not only need to pass its
contents but also its metadata (its value type -- named "dtype" --, its
shape and strides).  This may be serialized as simple tuples of atomic
types (str, int, bytes, other tuples), but you want to include a
memoryview of the data area somewhere in those tuples.

(And, of course, at some point this will feel like reinventing
pickle :)  But pickle has no mechanism to avoid memory copies, so it
can't readily be reused here -- otherwise you're just reinventing
multiprocessing...)
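
A sketch of the "tuples of atomic types plus a memoryview" idea, using
a stdlib array.array in place of a Numpy array so that it runs without
Numpy.  pack_array and unpack_array are names I invented for the
example, and the sketch assumes a C-contiguous buffer (so strides can
be derived from the shape and need not be sent).

```python
import array

def pack_array(data, shape):
    """Describe a buffer as (metadata tuple, raw byte view); no copy."""
    view = memoryview(data)
    metadata = (view.format, tuple(shape))
    return (metadata, view.cast("B"))

def unpack_array(message):
    """Rebuild a typed, shaped view over the same memory; no copy."""
    (fmt, shape), raw = message
    return raw.cast(fmt, shape)

data = array.array("d", [0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
message = pack_array(data, (2, 3))
grid = unpack_array(message)

# Writing through the original is visible through the rebuilt view,
# which shows that the data area was shared rather than copied.
data[4] = 40.0
shared_value = grid[1, 1]
```

Only the small metadata tuple would need to be serialized; the data
area travels as a buffer reference.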

> timeout arg to pop() and push()
> -------------------------------

pop() and push() don't exist anymore :-)

> Synchronization Primitives
> --------------------------
> 
> The ``threading`` module provides a number of synchronization primitives
> for coordinating concurrent operations.  This is especially necessary
> due to the shared-state nature of threading.  In contrast,
> subinterpreters do not share state.  Data sharing is restricted to
> channels, which do away with the need for explicit synchronization.

I think this rationale confuses Python-level data sharing with
process-level data sharing.  The main point of subinterpreters
(compared to multiprocessing) is that they live in the same OS
process.  So it's really not true that you can't share a low-level
synchronization primitive (say a semaphore) between subinterpreters.

(also see multiprocessing/synchronize.py, which implements all
synchronization primitives using basic low-level semaphores)
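
As a small illustration of that layering, here is a lock built on
nothing but a counting semaphore.  threading.Semaphore stands in for
the OS-level semaphore, and SemaphoreLock is an illustrative name; this
is a sketch of the idea, not a copy of what synchronize.py does.

```python
import threading

class SemaphoreLock:
    """Mutual exclusion expressed as a semaphore with a single permit."""
    def __init__(self):
        self._sem = threading.Semaphore(1)

    def acquire(self):
        self._sem.acquire()

    def release(self):
        self._sem.release()

    def __enter__(self):
        self.acquire()
        return self

    def __exit__(self, exc_type, exc, tb):
        self.release()
        return False

totals = {"count": 0}
lock = SemaphoreLock()

def bump(times):
    for _ in range(times):
        with lock:
            totals["count"] += 1

threads = [threading.Thread(target=bump, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```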

> Solutions include:
> 
> * a ``create()`` arg to indicate resetting ``__main__`` after each
>   ``run`` call
> * an ``Interpreter.reset_main`` flag to support opting in or out
>   after the fact
> * an ``Interpreter.reset_main()`` method to opt in when desired

This would all be a false promise.  Persistent state lives in other
places than __main__ (for example the loaded modules and their
respective configurations - think logging or decimal).
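
A quick way to see this within a single interpreter: run two snippets
in fresh namespaces (a stand-in for resetting __main__) and observe
that the decimal context set by the first still affects the second.
run_in_fresh_main is a helper I made up for the demonstration.

```python
import decimal

def run_in_fresh_main(source):
    """Execute source in a brand-new namespace, like a reset __main__."""
    namespace = {"__name__": "__main__"}
    exec(source, namespace)
    return namespace

original_prec = decimal.getcontext().prec

# First "run": leaves a global behind and changes module-level state.
run_in_fresh_main(
    "import decimal\n"
    "leftover = 1\n"
    "decimal.getcontext().prec = 5\n"
)

# Second "run": the namespace is clean, but the decimal context is not.
ns = run_in_fresh_main(
    "import decimal\n"
    "result = decimal.Decimal(1) / decimal.Decimal(3)\n"
)
leftover_survived = "leftover" in ns
result_text = str(ns["result"])

decimal.getcontext().prec = original_prec  # undo the demo's side effect
```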

> Use queues instead of channels
> ------------------------------
> 
> The main difference between queues and channels is that queues support
> buffering.  This would complicate the blocking semantics of ``recv()``
> and ``send()``.  Also, queues can be built on top of channels.

But buffering with background threads in pure Python will be orders
of magnitude slower than optimized buffering in a custom low-level
implementation.  It would be a pity if a subinterpreters Queue ended
up as slow as a multiprocessing Queue.

Regards

Antoine.