Hi Eric,

On Fri, 22 Sep 2017 19:09:01 -0600 Eric Snow <ericsnowcurrently@gmail.com> wrote:
> Please elaborate. I'm interested in understanding what you mean here. Do you have some subinterpreter-based concurrency improvements in mind? What aspect of CSP is the PEP following too faithfully?
See below the discussion of blocking send()s :-)
As to "running_interpreters()" and "idle_interpreters()", I'm not sure what the benefit would be. You can compose either list manually with a simple comprehension:
[interp for interp in interpreters.list_all() if interp.is_running()] [interp for interp in interpreters.list_all() if not interp.is_running()]
There is an inherent race condition in doing that, at least if interpreters are running in multiple threads (which I assume is going to be the overwhelmingly dominant usage model). That is why I'm proposing all three variants.
>> I don't think it's a coincidence that the most varied kinds of I/O (from socket or file IO to threading Queues to multiprocessing Pipes) have non-blocking send().
>
> Interestingly, you can set sockets to blocking mode, in which case send() will block until there is room in the kernel buffer.
Yes, but there *is* a kernel buffer. Which is the whole point of my comment: most similar primitives have internal buffering to prevent the user-facing send() API from blocking in the common case.
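To make that concrete, here is a small illustration of the buffered behaviour for sockets (my own sketch, not anything from the PEP): in non-blocking mode, send() either copies the data into the kernel buffer right away or raises BlockingIOError; it never waits for the peer to call recv().

    import socket

    a, b = socket.socketpair()
    a.setblocking(False)              # non-blocking mode, as discussed above
    try:
        sent = a.send(b"x" * 65536)   # copied into the kernel buffer; b need not recv() yet
        print("buffered", sent, "bytes without blocking")
    except BlockingIOError:
        print("kernel buffer full; try again later")
    finally:
        a.close()
        b.close()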
> Likewise, queue.Queue.put() supports blocking, in addition to providing a put_nowait() method.
queue.Queue.put() never blocks in the usual case (*), which is that of an unbounded queue. Only bounded queues (created with an explicit non-zero maxsize parameter) can block in Queue.put().

(*) and therefore also never deadlocks :-)
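To illustrate the distinction:

    import queue

    unbounded = queue.Queue()        # maxsize=0 means "no bound": put() never blocks
    unbounded.put("item")            # returns immediately, whatever the queue's size

    bounded = queue.Queue(maxsize=1) # only this kind of queue can block in put()
    bounded.put("first")             # fills the queue
    try:
        bounded.put("second", block=False)   # with block=True this would block instead
    except queue.Full:
        print("the bounded queue is full")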
> Note that the PEP provides "recv_nowait()" and "send_nowait()" (names inspired by queue.Queue), allowing for a non-blocking send.
True, but it's not the same thing at all. In the objects I mentioned, send() mostly doesn't block and doesn't fail either. In your model, send_nowait() will routinely fail with an error if a recipient isn't immediately available to recv the data.
>> send() blocking until someone else calls recv() is not only bad for performance,
>
> What is the performance problem?
Intuitively, there must be some kind of context switch (interpreter switch?) at each send() call to let the other end receive the data, since you don't have any internal buffering. Also, suddenly an interpreter's ability to exploit CPU time is dependent on another interpreter's ability to consume data in a timely manner (what if the other interpreter is e.g. stuck on some disk I/O?). IMHO it would be better not to have such coupling.
>> it also increases the likelihood of deadlocks.
>
> How much of a problem will deadlocks be in practice?
I expect it will happen more often than one would think, in complex systems :-) For example, you could have a recv() loop that also, from time to time, send()s some data on another queue, depending on what is received. But if that send()'s recipient also has the same structure (a recv() loop which send()s from time to time), then it's easy to imagine the two getting into a deadlock.
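To make the scenario concrete, here is a rough sketch of that shape using plain threads and a hand-rolled rendezvous channel (my stand-in for an unbuffered, blocking send(); nothing here is the PEP's API). Both workers end up wanting to send() while neither is in recv(), and they block forever:

    import threading

    class RendezvousChannel:
        """Unbuffered channel: send() blocks until another thread recv()s the item."""

        def __init__(self):
            self._cond = threading.Condition()
            self._item = None
            self._full = False

        def send(self, item):
            with self._cond:
                while self._full:            # wait until any previous item is taken
                    self._cond.wait()
                self._item, self._full = item, True
                self._cond.notify_all()
                while self._full:            # block until a receiver picks it up
                    self._cond.wait()

        def recv(self):
            with self._cond:
                while not self._full:
                    self._cond.wait()
                item, self._item, self._full = self._item, None, False
                self._cond.notify_all()
                return item

    a_to_b = RendezvousChannel()
    b_to_a = RendezvousChannel()

    def worker(name, inbox, outbox):
        # The structure described above: a recv() loop that sometimes send()s back.
        # "Sometimes" is "every time" here, which makes the deadlock immediate:
        # both workers get stuck in send() while neither is in recv().
        while True:
            outbox.send("hello from " + name)
            print(name, "got:", inbox.recv())

    t_a = threading.Thread(target=worker, args=("A", b_to_a, a_to_b), daemon=True)
    t_b = threading.Thread(target=worker, args=("B", a_to_b, b_to_a), daemon=True)
    t_a.start()
    t_b.start()
    t_a.join(timeout=2)
    print("deadlocked" if t_a.is_alive() else "finished")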
> (FWIW, CSP provides rigorous guarantees about deadlock detection (which Go leverages), though I'm not sure how much benefit that can offer such a dynamic language as Python.)
Hmm... deadlock detection is one thing, but when detected you must still solve those deadlock issues, right?
> I'm not sure I understand your concern here. Perhaps I used the word "sharing" too ambiguously? By "sharing" I mean that the two actors have read access to something that at least one of them can modify. If they both only have read-only access then it's effectively the same as if they are not sharing.
Right. What I mean is that you *can* share very simple "data" in the form of synchronization primitives. You may want to synchronize your interpreters even if they don't share user-visible memory areas. The point of synchronization is not only to avoid memory corruption but also to regulate and orchestrate processing amongst multiple workers (for example processes or interpreters). For example, a semaphore is an easy way to implement "I want no more than N workers to do this thing at the same time" ("this thing" can be something such as disk I/O); see the sketch below.
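A tiny sketch of that last example, using threads as a stand-in for interpreters (the PEP does not define a shareable semaphore, so this only shows the orchestration idea, not a proposed API):

    import threading

    MAX_CONCURRENT_IO = 4                            # "no more than N workers at a time"
    io_slots = threading.Semaphore(MAX_CONCURRENT_IO)

    def worker(n):
        with io_slots:                               # at most 4 workers inside this block
            print(f"worker {n}: doing the disk-I/O-bound part")
        print(f"worker {n}: continuing without holding a slot")

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(10)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

Regards

Antoine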