[Python-Dev] PEP 554 v3 (new interpreters module)

Eric Snow ericsnowcurrently at gmail.com
Fri Sep 22 21:09:01 EDT 2017

Thanks for the feedback, Antoine.  Sorry for the delay; it's been a
busy week for me.  I just pushed an updated PEP to the repo.  Once
I've sorted out the question of passing bytes through channels I plan
on posting the PEP to the list again for another round of discussion.
In the meantime, I've replied below in-line.


On Mon, Sep 18, 2017 at 4:46 AM, Antoine Pitrou <solipsis at pitrou.net> wrote:
> First my high-level opinion about the PEP: the CSP model can probably
> be already implemented using Queues.  To me, the interesting promise of
> subinterpreters is if they allow to remove the GIL while sharing memory
> for big objects (such as Numpy arrays).  This means the PEP should
> probably focus on potential concurrency improvements rather than try to
> faithfully follow the CSP model.

Please elaborate.  I'm interested in understanding what you mean here.
Do you have some subinterpreter-based concurrency improvements in
mind?  What aspect of CSP is the PEP following too faithfully?

>> ``list_all()``::
>>    Return a list of all existing interpreters.
> See my naming proposal in the previous thread.

Sorry, your previous comment slipped through the cracks.  You suggested:

    As for the naming, let's make it both unconfusing and explicit?
    How about three functions: `all_interpreters()`, `running_interpreters()`
    and `idle_interpreters()`, for example?

As to "all_interpreters()", I suppose it's the difference between
"interpreters.all_interpreters()" and "interpreters.list_all()".  To
me the latter looks better.

As to "running_interpreters()" and "idle_interpreters()", I'm not sure
what the benefit would be.  You can compose either list manually with
a simple comprehension:

    [interp for interp in interpreters.list_all() if interp.is_running()]
    [interp for interp in interpreters.list_all() if not interp.is_running()]
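
For illustration, those comprehensions could be wrapped as helper functions.  This is only a sketch against a stand-in interpreter object, since the proposed "interpreters" module doesn't exist yet:

```python
# Stand-in for the proposed Interpreter object; only is_running() matters here.
class FakeInterpreter:
    def __init__(self, running):
        self._running = running

    def is_running(self):
        return self._running


def running_interpreters(interps):
    # Equivalent of the first comprehension above.
    return [i for i in interps if i.is_running()]


def idle_interpreters(interps):
    # Equivalent of the second comprehension.
    return [i for i in interps if not i.is_running()]


interps = [FakeInterpreter(True), FakeInterpreter(False), FakeInterpreter(True)]
print(len(running_interpreters(interps)))  # 2
print(len(idle_interpreters(interps)))     # 1
```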

>>    run(source_str, /, **shared):
>>       Run the provided Python source code in the interpreter.  Any
>>       keyword arguments are added to the interpreter's execution
>>       namespace.
> "Execution namespace" specifically means the __main__ module in the
> target interpreter, right?

Right.  It's explained in more detail a little further down and
elsewhere in the PEP.  I've updated the PEP to explicitly mention
__main__ here too.

>>  If any of the values are not supported for sharing
>>       between interpreters then RuntimeError gets raised.  Currently
>>       only channels (see "create_channel()" below) are supported.
>>       This may not be called on an already running interpreter.  Doing
>>       so results in a RuntimeError.
> I would distinguish between both error cases: RuntimeError for calling
> run() on an already running interpreter, ValueError for values which
> are not supported for sharing.

Good point.

>>       Likewise, if there is any uncaught
>>       exception, it propagates into the code where "run()" was called.
> That makes it a bit harder to differentiate with errors raised by run()
> itself (see above), though how much of an annoyance this is remains
> unclear.  The more litigious implication, though, is that it forces the
> interpreter to support migration of arbitrary objects from one
> interpreter to another (since a traceback keeps all local variables
> alive).

Yeah, the proposal to propagate exceptions out of the subinterpreter
is still rather weak.  I've added some notes to the PEP about this
open issue.

>> The mechanism for passing objects between interpreters is through
>> channels.  A channel is a simplex FIFO similar to a pipe.  The main
>> difference is that channels can be associated with zero or more
>> interpreters on either end.
> So it seems channels have become more complicated now?  Is it important
> to support multi-producer multi-consumer channels?

To me it made the API simpler.  The change did introduce the "close()"
method, which I suppose could be confusing.  However, I'm sure that in
practice it won't be.  In contrast, the FIFO/pipe-based API that I had
before required passing names around, required more calls, required
managing the channel/interpreter relationship more carefully, and made
it hard to follow that relationship.

>>  Unlike queues, which are also many-to-many,
>> channels have no buffer.
> How does it work?  Does send() block until someone else calls recv()?
> That does not sound like a good idea to me.

Correct, "send()" blocks until the other end receives (if ever).
Likewise, "recv()" blocks until the other end sends.  This specific
behavior is probably the main thing I borrowed from CSP.  It is *the*
synchronization mechanism.  Given the isolated nature of
subinterpreters, I consider using this concept from CSP to be a good
fit.

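To illustrate the rendezvous semantics, here's a rough pure-Python
emulation of an unbuffered channel using threads.  (The PEP's channels
are cross-interpreter, so this is only an analogy, not the proposed
implementation.)

```python
import threading


class RendezvousChannel:
    """Unbuffered channel: send() blocks until some recv() takes the item."""

    def __init__(self):
        self._send_lock = threading.Lock()        # serialize senders
        self._item_ready = threading.Semaphore(0)
        self._item_taken = threading.Semaphore(0)
        self._item = None

    def send(self, obj):
        with self._send_lock:
            self._item = obj
            self._item_ready.release()  # offer the item
            self._item_taken.acquire()  # wait for a receiver (rendezvous)

    def recv(self):
        self._item_ready.acquire()      # wait for a sender
        obj = self._item
        self._item_taken.release()      # release the sender
        return obj


ch = RendezvousChannel()
results = []
t = threading.Thread(target=lambda: results.append(ch.recv()))
t.start()
ch.send("hello")   # does not return until the thread has received
t.join()
print(results)     # ['hello']
```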
>  I don't think it's a
> coincidence that the most varied kinds of I/O (from socket or file IO
> to threading Queues to multiprocessing Pipes) have non-blocking send().

Interestingly, you can set sockets to blocking mode, in which case
send() will block until there is room in the kernel buffer.  Likewise,
queue.Queue.put() supports blocking, in addition to providing a
non-blocking put_nowait() method.
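
For comparison, here is queue.Queue's blocking put() next to its
non-blocking put_nowait():

```python
import queue

q = queue.Queue(maxsize=1)
q.put("a")             # fits in the buffer, returns immediately

try:
    q.put_nowait("b")  # buffer full: raises instead of blocking
except queue.Full:
    print("queue full")

# A blocking put() with a timeout also raises once the wait expires.
try:
    q.put("b", timeout=0.01)
except queue.Full:
    print("timed out")
```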

Note that the PEP provides "recv_nowait()" and "send_nowait()" (names
inspired by queue.Queue), allowing for a non-blocking send.  It's just
not the default.  I deliberated for a little while on which one to
make the default.

In the end I went with blocking-by-default to stick to the CSP model.
However, I want to do what's most practical for users.  I can imagine
folks at first not expecting blocking send by default.  However, it
otherwise isn't clear yet which one is better for interpreter
channels.  I'll add an "open question" about switching to
non-blocking-by-default for send().

> send() blocking until someone else calls recv() is not only bad for
> performance,

What is the performance problem?

> it also increases the likelihood of deadlocks.

How much of a problem will deadlocks be in practice?  (FWIW, CSP
provides rigorous guarantees about deadlock detection (which Go
leverages), though I'm not sure how much benefit that can offer such a
dynamic language as Python.)  Regardless, I'll make sure the PEP
discusses deadlocks.

> EOFError normally means the *other* (sending) side has closed the
> channel (but it becomes complicated with a multi-producer multi-consumer
> setup...). When *this* side has closed the channel, we should raise
> ValueError.

I've fixed this in the PEP.

>>  The Python runtime
>>       will garbage collect all closed channels.  Note that "close()" is
>>       automatically called when it is no longer used in the current
>>       interpreter.
> "No longer used" meaning it loses all references in this interpreter?

Correct.  I've clarified this in the PEP.

> Similar remark as above (EOFError vs. ValueError).
> More generally, send() raising EOFError sounds unheard of.

Hmm.  I've fixed this in the PEP, but perhaps using EOFError here (and
even for read()) isn't right.  I was drawing inspiration from pipes,
but certainly the semantics aren't exactly the same.  So it may make
sense to use something else less I/O-related, like a new exception
type in the "interpreters" module.  I'll make a note in the PEP about
this open question.

> A sidenote: context manager support (__enter__ / __exit__) on channels
> would sound more useful to me than iteration support.

Yeah, I can see that.  FWIW, I've dropped __next__() from the PEP.
I've also added a note about adding context manager support.
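
Context manager support could look roughly like this.  A minimal
sketch with a stand-in Channel class; the actual PEP API may differ:

```python
class Channel:
    """Minimal stand-in showing __enter__/__exit__ tied to close()."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        self.close()
        return False  # don't suppress exceptions


with Channel() as ch:
    assert not ch.closed
print(ch.closed)  # True: closed automatically on exiting the block
```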

>> An alternative to support for bytes in channels is support for
>> read-only buffers (the PEP 3119 kind).
> Probably you mean PEP 3118.

Yep. :)

>> Then ``recv()`` would return
>> a memoryview to expose the buffer in a zero-copy way.
> It will probably not do much if you only can pass buffers and not
> structured objects, because unserializing (e.g. unpickling) from a
> buffer will still copy memory around.
> To pass a Numpy array, for example, you not only need to pass its
> contents but also its metadata (its value type -- named "dtype" --, its
> shape and strides).  This may be serialized as simple tuples of atomic
> types (str, int, bytes, other tuples), but you want to include a
> memoryview of the data area somewhere in those tuples.
> (and, of course, at some point, this will feel like reinventing
> pickle :)) but pickle has no mechanism to avoid memory copies, so it
> can't readily be reused here -- otherwise you're just reinventing
> multiprocessing...)

I'm still working through all the passing-buffers-through-channels
feedback, so I'll defer on a reply for now. :)
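
As a rough illustration of the metadata-plus-buffer idea, using the
stdlib array module instead of NumPy (the tuple layout here is made up
for the example):

```python
import array

data = array.array('d', [1.0, 2.0, 3.0])

# Hypothetical payload: (format code, shape, zero-copy view of the raw bytes).
# Only the small metadata tuple would need serializing; the buffer would not.
mv = memoryview(data)
payload = (mv.format, (len(data),), mv.cast('B'))

# Receiving side: rebuild a typed view without copying the data.
fmt, shape, raw = payload
restored = raw.cast(fmt, shape)
print(restored.tolist())  # [1.0, 2.0, 3.0]
```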

>> timeout arg to pop() and push()
>> -------------------------------
> pop() and push() don't exist anymore :-)

Fixed! :)

>> Synchronization Primitives
>> --------------------------
>> The ``threading`` module provides a number of synchronization primitives
>> for coordinating concurrent operations.  This is especially necessary
>> due to the shared-state nature of threading.  In contrast,
>> subinterpreters do not share state.  Data sharing is restricted to
>> channels, which do away with the need for explicit synchronization.
> I think this rationale confuses Python-level data sharing with
> process-level data sharing.  The main point of subinterpreters
> (compared to multiprocessing) is that they live in the same OS
> process.  So it's really not true that you can't share a low-level
> synchronization primitive (say a semaphore) between subinterpreters.

I'm not sure I understand your concern here.  Perhaps I used the word
"sharing" too ambiguously?  By "sharing" I mean that the two actors
have read access to something that at least one of them can modify.
If they both only have read-only access then it's effectively the same
as if they are not sharing.

While I can imagine the *possibility* (some day) of an opt-in
mechanism to share objects (r/rw or rw/rw), that is definitely not a
part of this PEP.  I expect that in reality we will only ever pass
immutable data between interpreters.  So I'm unclear on what need
there might be for any synchronization primitives other than what is
inherent to channels.

>> * a ``create()`` arg to indicate resetting ``__main__`` after each
>>   ``run`` call
>> * an ``Interpreter.reset_main`` flag to support opting in or out
>>   after the fact
>> * an ``Interpreter.reset_main()`` method to opt in when desired
> This would all be a false promise.  Persistent state lives in other
> places than __main__ (for example the loaded modules and their
> respective configurations - think logging or decimal).

I've added a bit more explanation to the PEP to clarify this point.

>> The main difference between queues and channels is that queues support
>> buffering.  This would complicate the blocking semantics of ``recv()``
>> and ``send()``.  Also, queues can be built on top of channels.
> But buffering with background threads in pure Python will be order
> of magnitudes slower than optimized buffering in a custom low-level
> implementation.  It would be a pity if a subinterpreters Queue ended
> out as slow as a multiprocessing Queue.

I agree.  I'm entirely open to supporting other object-passing types,
including adding low-level implementations.  I've added a note to the
PEP to that effect.

However, I wanted to start off with the most basic object-passing
type, and I felt that channels provides the simplest solution.  My
goal is to get a basic API landed in 3.7 and then build on it from
there for 3.8.

That said, in the interest of enabling extra utility in the near-term,
I expect that we will be able to design the PyInterpreterState changes
(few as they are) in such a way that a C-extension could implement an
efficient multi-interpreter Queue type that would run under 3.7.
Actually, would that be strictly necessary if you can interact with
channels without the GIL in the C-API?  Regardless, I'll make a note
in the PEP about the relationship between C-API and implementing an
efficient multi-interpreter Queue.  I suppose that means I need to add
C-API changes to the PEP (which I had wanted to avoid).