[Python-Dev] PEP 554 v3 (new interpreters module)

Sun Oct 8 22:27:17 EDT 2017

On 7 October 2017 at 02:29, Koos Zevenhoven <k7hoven at gmail.com> wrote:

> While I'm actually trying not to say much here so that I can avoid this
> discussion now, here's just a couple of ideas and thoughts from me at this
> point:
>
> (A)
> Instead of sending bytes and receiving memoryviews, one could consider
> sending *and* receiving memoryviews for now. That could then be extended
> into more types of objects in the future without changing the basic concept
> of the channel. Probably, the memoryview would need to be copied (but not
> the data of course). But I'm guessing copying a memoryview would be quite
> fast.
>

The proposal is to allow sending any buffer-exporting object, so sending a
memoryview would be supported.
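
As a rough illustration of what "any buffer-exporting object" covers, here's a minimal sketch using only the existing memoryview type. The as_payload() helper is a name made up for this example and isn't part of the proposal; it just shows that wrapping an exporter gives a flat byte view without copying the underlying data.

```python
import array

def as_payload(obj):
    """Return a flat, byte-oriented view of a buffer exporter (no data copy)."""
    # cast("B") requires a C-contiguous buffer, which holds for these types.
    return memoryview(obj).cast("B")

raw = bytearray(b"eggs")
numbers = array.array("i", [1, 2, 3])

views = [as_payload(b"spam"), as_payload(raw), as_payload(numbers),
         as_payload(memoryview(b"ham"))]

# The view shares memory with its exporter, so in-place changes are visible.
raw[0] = ord("E")
```

Since a memoryview is itself a buffer exporter, it goes through the same path as bytes, bytearray, or array.array.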

> This would hopefully require less API changes or additions in the future.
> OTOH, giving it a different name like MemChannel or making it 3rd party
> will buy some more time to figure out the right API. But maybe that's not
> needed.
>

I think having both a memory-centric data channel and an object-centric
data channel would be useful long term, so I don't see a lot of downsides
to starting with the easier-to-implement MemChannel, and then looking at
how to define a plain Channel later.

For example, it occurs to me that the closest current equivalent we have
to an object level counterpart to the memory buffer protocol would be the
weak reference protocol, wherein a multi-interpreter-aware proxy object
could actually take care of switching interpreters as needed when
manipulating reference counts.

While weakrefs themselves wouldn't be usable in the general case (many
builtin types don't support weak references, and we'd want to support
strong cross-interpreter references anyway), a wrapt-style object proxy
would provide us with a way to maintain a single strong reference to the
original object in its originating interpreter (implicitly switching to
that interpreter as needed), while also maintaining a regular local
reference count on the proxy object in the receiving interpreter.

And here's the neat thing: since subinterpreters share an address space, it
would be possible to experiment with an object-proxy based channel by
passing object pointers over a memoryview based channel.
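
A minimal sketch of that idea, run within a single interpreter and relying on the CPython implementation detail that id() returns the object's address. The helper names are hypothetical. Across real subinterpreters, the recovered object couldn't safely be used directly: the proxy would have to switch back to the originating interpreter first, and the sender would have to keep the object alive.

```python
import ctypes
import struct

def pointer_to_payload(obj):
    """Pack the object's address as native pointer-sized bytes."""
    # CPython detail: id() is the object's memory address.
    return struct.pack("P", id(obj))

def payload_to_object(payload):
    """Recover the object from a packed address (unsafe if it has been freed)."""
    (address,) = struct.unpack("P", payload)
    return ctypes.cast(address, ctypes.py_object).value

original = ["spam"]                        # sender keeps a strong reference
payload = memoryview(pointer_to_payload(original))
recovered = payload_to_object(payload)
```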

> (B)
> We would probably then like to pretend that the object coming out the
> other end of a Channel *is* the original object. As long as these channels
> are the only way to directly pass objects between interpreters, there are
> essentially only two ways to tell the difference (AFAICT):
>
> 1. Calling id(...) and sending it over to the other interpreter and
> checking if it's the same.
>
> 2. When the same object is sent twice to the same interpreter. Then one
> can compare the two with id(...) or using the `is` operator.
>
> There are solutions to the problems too:
>
> 1. Send the id() from the sending interpreter along with the sent object
> so that the receiving interpreter can somehow attach it to the object and
> then return it from id(...).
>
> 2. When an object is received, make a lookup in an interpreter-wide cache
> to see if an object by this id has already been received. If yes, take that
> one.
>
> Now it should essentially look like the received object is really "the
> same one" as in the sending interpreter. This should also work with
> multiple interpreters and multiple channels, as long as the id is always
> preserved.
>

I don't personally think we want to expend much (if any) effort on
presenting the illusion that the objects on either end of the channel are
the "same" object, but postponing the question entirely is also one of the
benefits I see to starting with MemChannel, and leaving the object-centric
Channel until later.

> (C)
> One further complication regarding memoryview in general is that
> .release() should probably be propagated to the sending interpreter somehow.
>

Yep, switching interpreters when releasing the buffer is the main reason
you couldn't use a regular memoryview for this purpose - you need a variant
that holds a strong reference to the sending interpreter, and switches back
to it for the buffer release operation.
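
To sketch the shape of such a variant: the wrapper below pairs a memoryview with a callback that runs exactly once on release. Both the ReleasingView name and the callback are assumptions for illustration only; in a real implementation the callback's role would be played by switching to the sending interpreter and releasing the buffer there, which plain Python code can't express.

```python
class ReleasingView:
    """Hypothetical wrapper: a memoryview plus a hook that runs on release."""

    def __init__(self, exporter, on_release):
        self._view = memoryview(exporter)
        self._on_release = on_release

    def release(self):
        # Idempotent: the hook fires only on the first release.
        if self._on_release is None:
            return
        self._view.release()
        callback, self._on_release = self._on_release, None
        callback()  # stand-in for "switch interpreters and release there"

    def __enter__(self):
        return self._view

    def __exit__(self, *exc_info):
        self.release()

released = []
wrapper = ReleasingView(bytearray(b"data"), lambda: released.append(True))
with wrapper as view:
    contents = bytes(view)
wrapper.release()  # second call is a no-op
```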

> (D)
> I think someone already mentioned this one, but would it not be better to
> start a new interpreter in the background in a new thread by default? I
> think this would make things simpler and leave more freedom regarding the
> implementation in the future. If you need to run an interpreter within the
> current thread, you could perhaps optionally do that too.
>

Not really, as that approach doesn't compose as well with existing thread
management primitives like concurrent.futures.ThreadPoolExecutor. It also
doesn't match the way the existing subinterpreter machinery works, where
threads can change their active interpreter.
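
To make the composition point concrete, here's a sketch with a placeholder class. StandInInterpreter is NOT the PEP 554 API; it only mimics the one property that matters here, namely that run() executes in whichever thread calls it. Given that property, an existing thread pool can drive the interpreters without any dedicated thread-spawning support.

```python
from concurrent.futures import ThreadPoolExecutor
import threading

class StandInInterpreter:
    """Hypothetical placeholder: run() executes code in the calling thread."""

    def __init__(self):
        self.namespace = {}

    def run(self, source):
        exec(source, self.namespace)
        return threading.current_thread().name

interpreters = [StandInInterpreter() for _ in range(3)]

# The pool decides which thread runs each interpreter; nothing is spawned
# implicitly by the interpreter objects themselves.
with ThreadPoolExecutor(max_workers=2, thread_name_prefix="interp") as pool:
    thread_names = list(
        pool.map(lambda interp: interp.run("answer = 6 * 7"), interpreters)
    )
```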

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia