On 6 October 2017 at 11:48, Eric Snow <ericsnowcurrently@gmail.com> wrote:
> And that's the real pay-off that comes from defining this in terms of the
> memoryview protocol: Py_buffer structs *aren't* Python objects, so it's only
> a regular C struct that gets passed across the interpreter boundary (the
> reference to the original objects gets carried along passively as part of
> the CIV - it never gets *used* in the receiving interpreter).

Yeah, the (PEP 3118) buffer protocol offers precedent in a number of
ways that are applicable to channels here.  I'm simply reluctant to
lock PEP 554 into such a specific solution as the buffer-specific CIV.
I'm trying to accommodate anticipated future needs while keeping the
PEP as simple and basic as possible.  It's driving me nuts! :P  Things
were *much* simpler before I added Channels to the PEP. :)

Starting with memory-sharing only doesn't lock us into anything, since you can still add a more flexible kind of channel based on a different protocol later if it turns out that memory sharing isn't enough.

By contrast, if you make the initial channel semantics incompatible with multiprocessing by design, you *will* prevent anyone from experimenting with replicating the shared memory based channel API for communicating between processes :)

That said, if you'd prefer to keep the "Channel" name available for the possible introduction of object channels at a later date, you could call the initial memoryview based channel a "MemChannel".
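
To make the shape of that concrete, here's a rough sketch of the kind of usage I have in mind. The interpreters module is the one PEP 554 already proposes; MemChannel, send_buffer and the rest are illustrative names only, not a concrete proposal:

    # Hypothetical sketch only - MemChannel and send_buffer don't exist today.
    import interpreters                   # the module proposed in PEP 554

    interp = interpreters.create()
    ch = interpreters.MemChannel()        # hypothetical buffer-based channel

    payload = bytearray(b"some shared data")
    ch.send_buffer(payload)               # only the Py_buffer info crosses over;
                                          # the exporting object stays where it is

    # and in the receiving interpreter:
    #   view = ch.recv()                  # a cross-interpreter memoryview subclass
    #   header = view[:4].tobytes()       # zero-copy access to the sender's data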
 
> I don't think we should be touching the behaviour of core builtins solely to
> enable message passing to subinterpreters without a shared GIL.

Keep in mind that I included the above as a possible solution using
tp_share() that would work *after* we stop sharing the GIL.  My point
is that with tp_share() we have a solution that works now *and* will
work later.  I don't care how we use tp_share to do so. :)  I long to
be able to say in the PEP that you can pass bytes through the channel
and get bytes on the other side.

Memory views are a builtin type as well, and they emphasise the practical benefit we're trying to get relative to typical multiprocessing arrangements: zero-copy data sharing.
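
Even within a single interpreter, that zero-copy property is easy to demonstrate with nothing but today's builtins:

    # Plain CPython today: the view reads and writes the exporter's
    # memory directly, without copying it.
    buf = bytearray(b"hello world")
    view = memoryview(buf)

    view[0:5] = b"HELLO"          # writes straight into buf's storage
    assert bytes(buf) == b"HELLO world"
    assert view.obj is buf        # the view keeps the exporter alive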

So here's my proposed experimentation-enabling development strategy:

1. Start out with a MemChannel API that accepts any buffer-exporting object as input and outputs only a cross-interpreter memoryview subclass
2. Use that as the basis for the work to get to a per-interpreter locking arrangement that allows subinterpreters to fully exploit multiple CPUs
3. Only then try to design a Channel API for sharing builtin immutable objects (bytes, strings, numbers) between interpreters, once you can be certain that doing so won't inadvertently make it harder to turn the GIL into a truly per-interpreter lock, rather than the current process-global runtime lock.

The key benefit of this approach is that we *know* MemChannel can work: the buffer protocol already operates at the level of C structs and pointers, not Python objects, and there are already plenty of interesting buffer-protocol-supporting objects around, so as long as the CIV switches interpreters at the right time, there aren't any fundamentally new runtime level capabilities needed to implement it.
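
And "plenty of interesting buffer-protocol-supporting objects" isn't an exaggeration - sticking purely to the standard library:

    # All of these already implement the PEP 3118 buffer protocol, so any
    # of them could be handed to a buffer-based channel like the
    # MemChannel sketched above, without any changes on their side.
    import array, ctypes, mmap

    exporters = [
        b"immutable bytes",
        bytearray(b"mutable bytes"),
        array.array("d", [1.0, 2.0, 3.0]),
        mmap.mmap(-1, 4096),              # anonymous memory map
        (ctypes.c_uint32 * 8)(),          # a C array of 8 uint32s
    ]
    views = [memoryview(obj) for obj in exporters]   # no copies made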

The lower-level MemChannel API could then also be replicated for multiprocessing, while the higher-level, more speculative object-based Channel API would be specific to subinterpreters (and probably only ever designed and implemented if you first succeed in making subinterpreters sufficiently independent that they no longer rely on a process-wide GIL).

So I'm not saying "Never design an object-sharing protocol specifically for use with subinterpreters". I'm saying "You don't have a demonstrated need for that yet, so don't try to define it until you do".

 
My mind is drawn to the comparison between that and the question of
CIV vs. tp_share().  CIV would be more like the post-451 import world,
where I expect the CIV would take care of the data sharing operations.
That said, the situation in PEP 554 is sufficiently different that I'm
not convinced a generic CIV protocol would be better.  I'm not sure
how much CIV could do for you over helpers+tp_share.

Anyway, here are the leading approaches that I'm looking at now:

* adding a tp_share slot
  + you send() the object directly and recv() the object coming out
    of tp_share() (which will probably be the same type as the original)
  + this would eventually require small changes in tp_free for
    participating types
  + we would likely provide helpers (eventually), similar to the new
    buffer protocol, to make it easier to manage sharing data
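
If I'm following the proposal correctly, the user-facing contract you're aiming for is roughly the one below - the names are purely illustrative, and the receiving side is shown as comments since none of this exists yet:

    # Hypothetical object-channel semantics under the tp_share approach:
    # send() takes the object itself, and recv() returns whatever the
    # type's tp_share slot produces - probably an equal object of the
    # same builtin type.
    import interpreters                   # as in the earlier sketch

    ch = interpreters.Channel()           # hypothetical object channel
    ch.send(b"spam")

    # and in the receiving interpreter:
    #   obj = ch.recv()
    #   assert type(obj) is bytes and obj == b"spam"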

I'm skeptical about this approach because you'll be designing in a vacuum against future possible constraints that you can't test yet: the inherent complexity in the object sharing protocol will come from *not* having a process-wide GIL, but you'll be starting out with a process-wide GIL still in place. And that means third parties will inevitably rely on the process-wide GIL in their tp_share implementations (despite their best intentions), and you'll end up with the same issue that causes problems for the rest of the C API.

By contrast, if you delay this step until *after* the GIL has successfully been shifted to being per-interpreter, then by the time the new protocol is defined, people will also be able to test their tp_share implementations properly.

At that point, you'd also presumably have evidence of demand to justify the introduction of a new core language protocol, as:

* folks will only complain about the limitations of MemChannel if they're actually using subinterpreters
* the complaints about the limitations of MemChannel would help guide the object sharing protocol design
 
* simulating tp_share via an external global registry (or a registry
  on the Channel type)
  + it would still be hard to make work without hooking into tp_free()
* CIVs hard-coded in Channel (or BufferViewChannel, etc.) for
  specific types (e.g. buffers)
  + you send() the object like normal, but recv() the view
* a CIV protocol on Channel by which you can add support for more types
  + you send() the object like normal but recv() the view
  + could work through subclassing or a registry
  + a lot of conceptual similarity with tp_share+tp_free
* a CIV-like proxy
  + you wrap the object, send() the proxy, and recv() a proxy
  + this is entirely compatible with tp_share()

* Allow for multiple channel types, such that MemChannel is merely the *first* channel type, rather than the *only* channel type
  + Allows PEP 554 to be restricted to things we already know can be made to work
  + Doesn't block the introduction of an object-sharing based Channel in some future release
  + Allows for at least some channel types to be adapted for use with shared memory and multiprocessing
 
Here are what I consider the key criteria for judging the utility of
a solution (in no particular order):

* how hard to understand as a Python programmer?

Not especially important yet - this is more a criterion for the final API than for the initial experimental platform.
 
* how much extra work (if any) for folks calling Channel.send()?
* how much extra work (if any) for folks calling Channel.recv()?

I don't think either are particularly important yet, although we also don't want to raise any pointless barriers to experimentation.
 
* how complex is the CPython implementation?

This is critical, since we want to minimise any potential for undesirable side effects on regular single interpreter code.
 
* how hard to understand as a type author (wanting to add support
  for their type)?
* how hard to add support for a new type?
* what variety of types could be supported?
* what breadth of experimentation opens up?

You missed the big one: what risk does the initial channel design pose to the underlying objective of making the GIL a genuinely per-interpreter lock?

If we don't eventually reach that goal, then subinterpreters won't really offer much in the way of compelling benefits over just using a thread pool and queue.Queue.

MemChannel poses zero additional risk to that, since we wouldn't be sharing actual Python objects between interpreters, only C pointers and structs.

By contrast, introducing an object channel early poses significant new risks to that goal, since it will force you to solve hard protocol design and refcount management problems *before* making the switch, rather than being able to defer the design of the object channel protocol until *after* you've already enabled the ability to run subinterpreters in completely independent threads.
 
The most important thing to me is keeping things simple for Python
programmers.  After that is ease-of-use for type authors.  However, I
also want to put us in a good position in 3.7 to experiment
extensively with subinterpreters, so that's a big consideration.

Consequently, for PEP 554 my goal is to find a solution for object
sharing that keeps things simple in Python while laying a basic
foundation we can build on at the C level, so we don't get locked in
but still maximize our opportunities to experiment. :)

I think our priorities are quite different, then. I believe PEP 554 should focus on defining a relatively easy-to-implement API that nevertheless makes it possible to write interesting programs while work proceeds on making the GIL per-interpreter, without worrying too much about whether the initial cross-interpreter communication channels closely resemble the final ones intended for more general use.

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan@gmail.com   |   Brisbane, Australia