On 7 October 2017 at 02:29, Koos Zevenhoven firstname.lastname@example.org wrote:
While I'm actually trying not to say much here so that I can avoid this discussion now, here's just a couple of ideas and thoughts from me at this point:
(A) Instead of sending bytes and receiving memoryviews, one could consider sending *and* receiving memoryviews for now. That could then be extended into more types of objects in the future without changing the basic concept of the channel. Probably, the memoryview would need to be copied (but not the data of course). But I'm guessing copying a memoryview would be quite fast.
The proposal is to allow sending any buffer-exporting object, so sending a memoryview would be supported.
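As a rough illustration of what "any buffer-exporting object" covers, the sketch below wraps several standard types in memoryviews and shows that copying a view does not copy the underlying data. The helper name `as_view` is made up for this example and is not part of the proposal.

```python
import array


def as_view(obj):
    """Wrap any buffer-exporting object in a memoryview without copying its data."""
    return memoryview(obj)


# bytes, bytearray, array.array and memoryview itself all export a buffer.
payloads = [b"spam", bytearray(b"eggs"), array.array("B", [1, 2, 3]), memoryview(b"ham")]
views = [as_view(p) for p in payloads]

# Copying a memoryview creates a second view over the same underlying buffer,
# so a change made through the exporter is visible through both views.
source = bytearray(b"hello")
first = memoryview(source)
second = memoryview(first)
source[0] = ord("j")
```

Since only the small view object is duplicated, the cost of copying should not depend on the size of the data.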
This would hopefully require fewer API changes or additions in the future. OTOH, giving it a different name like MemChannel or making it 3rd party will buy some more time to figure out the right API. But maybe that's not needed.
I think having both a memory-centric data channel and an object-centric data channel would be useful long term, so I don't see a lot of downsides to starting with the easier-to-implement MemChannel, and then looking at how to define a plain Channel later.
For example, it occurs to me that the closest current equivalent we have to an object level counterpart to the memory buffer protocol would be the weak reference protocol, wherein a multi-interpreter-aware proxy object could actually take care of switching interpreters as needed when manipulating reference counts.
While weakrefs themselves wouldn't be usable in the general case (many builtin types don't support weak references, and we'd want to support strong cross-interpreter references anyway), a wrapt-style object proxy would provide us with a way to maintain a single strong reference to the original object in its originating interpreter (implicitly switching to that interpreter as needed), while also maintaining a regular local reference count on the proxy object in the receiving interpreter.
And here's the neat thing: since subinterpreters share an address space, it would be possible to experiment with an object-proxy based channel by passing object pointers over a memoryview based channel.
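A minimal sketch of that experiment, assuming CPython (where id() is the object's address) and running inside a single interpreter only: the address is packed into a pointer-sized payload, carried as a memoryview, and turned back into an object reference on the other side. The function names are hypothetical, and a real cross-interpreter version would still need the proxy layer to keep reference counting in the owning interpreter.

```python
import ctypes
import struct


def encode_pointer(obj):
    """Pack the object's address (CPython's id()) into a pointer-sized bytes payload."""
    return struct.pack("P", id(obj))


def decode_pointer(payload):
    """Recover the object from a packed address.

    Only safe while the original object is still alive, and only meaningful
    within one address space.
    """
    (address,) = struct.unpack("P", payload)
    return ctypes.cast(address, ctypes.py_object).value


original = {"answer": 42}
payload = memoryview(encode_pointer(original))  # what would travel over the channel
recovered = decode_pointer(payload)
```

Note that the payload carries no ownership information at all, so whoever sends it must keep the original object alive until the receiver is done with it.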
(B) We would probably then like to pretend that the object coming out the other end of a Channel *is* the original object. As long as these channels are the only way to directly pass objects between interpreters, there are essentially only two ways to tell the difference (AFAICT):
- Calling id(...) and sending it over to the other interpreter and checking if it's the same.
- When the same object is sent twice to the same interpreter. Then one can compare the two with id(...) or using the `is` operator.
There are solutions to the problems too:
- Send the id() from the sending interpreter along with the sent object so that the receiving interpreter can somehow attach it to the object and then return it from id(...).
- When an object is received, make a lookup in an interpreter-wide cache to see if an object by this id has already been received. If yes, take that one.
Now it should essentially look like the received object is really "the same one" as in the sending interpreter. This should also work with multiple interpreters and multiple channels, as long as the id is always preserved.
I don't personally think we want to expend much (if any) effort on presenting the illusion that the objects on either end of the channel are the "same" object, but postponing the question entirely is also one of the benefits I see to starting with MemChannel, and leaving the object-centric Channel until later.
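For anyone who does want to experiment with the lookup suggested in (B), a minimal sketch might look like the following. `Received` and `ReceiveCache` are hypothetical names, and the sketch ignores the fact that an id can be reused once the original object has been freed in the sending interpreter.

```python
import weakref


class Received:
    """Hypothetical wrapper for an object received from another interpreter."""

    def __init__(self, origin_id, payload):
        self.origin_id = origin_id  # id() reported by the sending interpreter
        self.payload = payload


class ReceiveCache:
    """Per-interpreter cache keyed by the sender's id()."""

    def __init__(self):
        # Weak values: an entry disappears once the receiver drops the object.
        self._by_origin_id = weakref.WeakValueDictionary()

    def receive(self, origin_id, payload):
        existing = self._by_origin_id.get(origin_id)
        if existing is not None:
            return existing  # same origin id: hand back the earlier object
        obj = Received(origin_id, payload)
        self._by_origin_id[origin_id] = obj
        return obj


cache = ReceiveCache()
first = cache.receive(1001, b"data")
again = cache.receive(1001, b"data")
other = cache.receive(1002, b"more")
```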
(C) One further complication regarding memoryview in general is that .release() should probably be propagated to the sending interpreter somehow.
Yep, switching interpreters when releasing the buffer is the main reason you couldn't use a regular memoryview for this purpose - you need a variant that holds a strong reference to the sending interpreter, and switches back to it for the buffer release operation.
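The shape of such a variant can be sketched in pure Python, with a plain callback standing in for "switch back to the sending interpreter and release there". `ReleasingView` and its `on_release` parameter are hypothetical names for illustration; the real thing would live at the C level.

```python
class ReleasingView:
    """Hypothetical wrapper: a memoryview plus a callback that runs on release."""

    def __init__(self, exporter, on_release):
        self._view = memoryview(exporter)
        self._on_release = on_release
        self._released = False

    @property
    def view(self):
        return self._view

    def release(self):
        if self._released:
            return  # releasing twice is a no-op
        self._released = True
        self._view.release()
        # A real implementation would switch to the sending interpreter here
        # before letting the exporter know its buffer is free again.
        self._on_release()

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        self.release()


events = []
data = bytearray(b"abc")
with ReleasingView(data, lambda: events.append("released")) as wrapped:
    snapshot = bytes(wrapped.view)
data.extend(b"d")  # resizing is allowed again once the view has been released
```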
(D) I think someone already mentioned this one, but would it not be better to start a new interpreter in the background in a new thread by default? I think this would make things simpler and leave more freedom regarding the implementation in the future. If you need to run an interpreter within the current thread, you could perhaps optionally do that too.
Not really, as that approach doesn't compose as well with existing thread management primitives like concurrent.futures.ThreadPoolExecutor. It also doesn't match the way the existing subinterpreter machinery works, where threads can change their active interpreter.
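To make the composition point concrete, here is a sketch in which each worker thread of a ThreadPoolExecutor runs code in "its own interpreter" from within the current thread. `FakeInterpreter` is purely a stand-in invented for this example: it uses exec() with a private namespace and offers none of the isolation of a real subinterpreter.

```python
from concurrent.futures import ThreadPoolExecutor


class FakeInterpreter:
    """Hypothetical stand-in: run() executes code in the calling thread."""

    def __init__(self):
        self.namespace = {}

    def run(self, source):
        # A real subinterpreter would make itself the thread's active
        # interpreter for the duration of this call.
        exec(source, self.namespace)
        return self.namespace.get("result")


def work(n):
    interp = FakeInterpreter()
    return interp.run(f"result = {n} ** 2")


# The pool owns the threads; the interpreters simply run inside them.
with ThreadPoolExecutor(max_workers=2) as pool:
    squares = list(pool.map(work, range(4)))
```

Because run() blocks the calling thread, thread lifetime, pool size and result handling all stay under the control of the existing executor.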