New blog post: Notes on structured concurrency, or: Go statement considered harmful
Hi all,

I just posted another essay on concurrent API design:

https://vorpus.org/blog/notes-on-structured-concurrency-or-go-statement-cons...

This is the one that finally gets at the core reasons why Trio exists; I've been trying to figure out how to write it for at least a year now. I hope you like it.

(Guido: this is the one you should read :-). Or if it's too much, you can jump to the conclusion [1], and I'm happy to come find you somewhere with a whiteboard, if that'd be helpful!)

-n

[1] https://vorpus.org/blog/notes-on-structured-concurrency-or-go-statement-cons...

-- Nathaniel J. Smith -- https://vorpus.org
Interesting, thanks

On Wed, Apr 25, 2018 at 12:24 PM Nathaniel Smith <njs@pobox.com> wrote:
Hi all,
I just posted another essay on concurrent API design:
https://vorpus.org/blog/notes-on-structured-concurrency-or-go-statement-cons...
This is the one that finally gets at the core reasons why Trio exists; I've been trying to figure out how to write it for at least a year now. I hope you like it.
(Guido: this is the one you should read :-). Or if it's too much, you can jump to the conclusion [1], and I'm happy to come find you somewhere with a whiteboard, if that'd be helpful!)
-n
[1] https://vorpus.org/blog/notes-on-structured-concurrency-or-go-statement-cons...
-- Nathaniel J. Smith -- https://vorpus.org
-- Thanks, Andrew Svetlov
On Wed, 25 Apr 2018 02:24:15 -0700 Nathaniel Smith <njs@pobox.com> wrote:
Hi all,
I just posted another essay on concurrent API design:
https://vorpus.org/blog/notes-on-structured-concurrency-or-go-statement-cons...
This is the one that finally gets at the core reasons why Trio exists; I've been trying to figure out how to write it for at least a year now. I hope you like it.
My experience is indeed that something like the nursery construct would make concurrent programming much more robust in complex cases. This is a great explanation why.

API note: I would expect to be able to use it this way:

    class MyEndpoint:

        def __init__(self):
            self._nursery = open_nursery()

        # Lots of behaviour methods that can put new tasks in the nursery

        def close(self):
            self._nursery.close()

Also perhaps more finegrained shutdown routines such as:

* Nursery.join(cancel_after=None): wait for all tasks to join, cancel the remaining ones after the given timeout

Regards

Antoine.
On Wed, Apr 25, 2018 at 3:17 AM, Antoine Pitrou <solipsis@pitrou.net> wrote:
On Wed, 25 Apr 2018 02:24:15 -0700 Nathaniel Smith <njs@pobox.com> wrote:
Hi all,
I just posted another essay on concurrent API design:
https://vorpus.org/blog/notes-on-structured-concurrency-or-go-statement-cons...
This is the one that finally gets at the core reasons why Trio exists; I've been trying to figure out how to write it for at least a year now. I hope you like it.
My experience is indeed that something like the nursery construct would make concurrent programming much more robust in complex cases. This is a great explanation why.
Thanks!
API note: I would expect to be able to use it this way:
    class MyEndpoint:

        def __init__(self):
            self._nursery = open_nursery()

        # Lots of behaviour methods that can put new tasks in the nursery

        def close(self):
            self._nursery.close()
You might expect to be able to use it that way, but you can't! The 'async with' part of 'async with open_nursery()' is mandatory. This is what I mean about it forcing you to rethink things, and why I think there is room for genuine controversy :-). (Just like there was about goto -- it's weird to think that it could have turned out differently in hindsight, but people really did have valid concerns...)

I think the pattern we're settling on for this particular case is:

    class MyEndpoint:
        def __init__(self, nursery, ...):
            self._nursery = nursery

        # methods here that use nursery

    @asynccontextmanager
    async def open_my_endpoint(...):
        async with trio.open_nursery() as nursery:
            yield MyEndpoint(nursery, ...)

Then most end-users do 'async with open_my_endpoint() as endpoint:' and then use the 'endpoint' object inside the block; or if you have some special reason why you need to have multiple endpoints in the same nursery (e.g. you have an unbounded number of endpoints and don't want to have to somehow write an unbounded number of 'async with' blocks in your source code), then you can call MyEndpoint() directly and pass an explicit nursery. A little bit of extra fuss, but not too much. So that's how you handle it.

Why do we make you jump through these hoops? The problem is, we want to enforce that each nursery object's lifetime is bound to the lifetime of a calling frame. The point of the 'async with' in 'async with open_nursery()' is to perform this binding. To reduce errors, open_nursery() doesn't even return a nursery object -- only open_nursery().__aenter__() does that. Otherwise, if a task in the nursery has an unhandled error, we have nowhere to report it (among other issues).

Of course this is Python, so you can always do gross hacks like calling __aenter__ yourself, but then you're responsible for making sure the context manager semantics are respected. In most systems you'd expect this kind of thing to be syntactically enforced as part of the language; it's actually pretty amazing that Trio is able to make things work as well as it can as a "mere library". It's really a testament to how much thought has been put into Python -- other languages don't really have any equivalent to 'with' or Python's generator-based async/await.
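As a concrete illustration of the pattern above, here is a minimal runnable sketch. The endpoint class, its heartbeat methods, and open_my_endpoint are illustrative names only (not a real Trio API), and it assumes Python 3.7's contextlib.asynccontextmanager:

    from contextlib import asynccontextmanager

    import trio

    class MyEndpoint:
        def __init__(self, nursery):
            self._nursery = nursery

        def start_heartbeat(self, interval, count):
            # A behaviour method that puts a new task in the endpoint's nursery.
            self._nursery.start_soon(self._heartbeat, interval, count)

        async def _heartbeat(self, interval, count):
            for _ in range(count):
                print("ping")
                await trio.sleep(interval)

    @asynccontextmanager
    async def open_my_endpoint():
        # Bind the nursery's lifetime to the caller's 'async with' block.
        async with trio.open_nursery() as nursery:
            yield MyEndpoint(nursery)

    async def main():
        async with open_my_endpoint() as endpoint:
            endpoint.start_heartbeat(0.1, 3)
        # Leaving the block waits for the heartbeats the endpoint started.

    trio.run(main)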
Also perhaps more finegrained shutdown routines such as:
* Nursery.join(cancel_after=None):
wait for all tasks to join, cancel the remaining ones after the given timeout
Hmm, I've never needed that particular pattern, but it's actually pretty easy to express. I didn't go into it in this writeup, but: because nurseries need to be able to cancel their contents in order to unwind the stack during exception propagation, they need to enclose their contents in a cancel scope. And since they have this cancel scope anyway, we expose it on the nursery object. And cancel scopes allow you to adjust their deadline. So if you write:

    async with trio.open_nursery() as nursery:
        ... blah blah ...
        # Last line before exiting the block and triggering the implicit join():
        nursery.cancel_scope.deadline = trio.current_time() + TIMEOUT

then it'll give you the semantics you're asking about. There could be more sugar for this if it turns out to be useful. Maybe a .timeout attribute on cancel scopes that's a magic property always equal to (self.deadline - trio.current_time()), so you could do 'nursery.cancel_scope.timeout = TIMEOUT'?

-n

-- Nathaniel J. Smith -- https://vorpus.org
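To make the deadline trick above concrete, here is a small self-contained sketch (the worker function and its durations are made up for illustration): the nursery waits for its children at the implicit join, and any task still running when the deadline passes is cancelled.

    import trio

    TIMEOUT = 2

    async def worker(seconds):
        await trio.sleep(seconds)
        print("worker finished after", seconds, "seconds")

    async def main():
        async with trio.open_nursery() as nursery:
            nursery.start_soon(worker, 1)    # finishes before the deadline
            nursery.start_soon(worker, 60)   # still running: gets cancelled
            # Last line before the implicit join: allow TIMEOUT more seconds.
            nursery.cancel_scope.deadline = trio.current_time() + TIMEOUT

    trio.run(main)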
Now there's a PEP I'd like to see.

On Wed, Apr 25, 2018 at 2:24 AM, Nathaniel Smith <njs@pobox.com> wrote:
Hi all,
I just posted another essay on concurrent API design:
https://vorpus.org/blog/notes-on-structured-concurrency-or-go-statement-considered-harmful/
This is the one that finally gets at the core reasons why Trio exists; I've been trying to figure out how to write it for at least a year now. I hope you like it.
(Guido: this is the one you should read :-). Or if it's too much, you can jump to the conclusion [1], and I'm happy to come find you somewhere with a whiteboard, if that'd be helpful!)
-n
[1] https://vorpus.org/blog/notes-on-structured-concurrency-or-go-statement-considered-harmful/#conclusion
-- Nathaniel J. Smith -- https://vorpus.org
-- --Guido van Rossum (python.org/~guido)
On Wed, Apr 25, 2018 at 9:43 PM, Guido van Rossum <guido@python.org> wrote:
Now there's a PEP I'd like to see.
Which part?

-n

-- Nathaniel J. Smith -- https://vorpus.org
Adding nurseries to asyncio (or wherever in the stdlib they fit -- if they can be independent from asyncio and shared between asyncio and trio, all the better).

On Thu, Apr 26, 2018 at 10:03 PM, Nathaniel Smith <njs@pobox.com> wrote:
On Wed, Apr 25, 2018 at 9:43 PM, Guido van Rossum <guido@python.org> wrote:
Now there's a PEP I'd like to see.
Which part?
-n
-- Nathaniel J. Smith -- https://vorpus.org
-- --Guido van Rossum (python.org/~guido)
My 2c after careful reading:

restarting tasks automatically (custom nursery example) is quite questionable:

* it's unexpected
* it's not generally safe (argument reuse, side effects)
* user's coroutine can be decorated to achieve same effect

I'd say just remove this, it's not relevant to your thesis.

It's very nice to have the escape hatch of posting tasks to "someone else's" nursery. I feel there are more caveats to posting a task to a parent's or global nursery though. Consider that local tasks typically await on other local tasks. What happens when N1-task1 waits on N2-task2 and N2-task9 encounters an error? My guess is N2-task2 is cancelled, which by default cancels N1-task1 too, right? That kinda breaks the abstraction, doesn't it?

If the escape hatch is available, how about allowing tasks to be moved between nurseries?

Is dependency inversion allowed? (As in, given parent N1 and child N1.N2, can N1.N2.t2 await on N1.t1?) If that's the case, I guess it's not a "tree of tasks", as in the graph is arbitrary, not a DAG. I've seen [proprietary] strict DAG task frameworks. While they are useful to e.g. perform sub-requests in parallel, they are not general enough to be useful at large. Thus I'm assuming trio does not enforce a DAG...

Finally, slob programmers like me occasionally want fire-and-forget tasks, aka daemonic threads. Some are long-lived, e.g. "battery status poller", others short-lived, e.g. "tail part of low-latency logging". Obv., a careful programmer would keep track of those, but we want things simple :) Perhaps in line with the batteries-included principle, trio could include a standard way to accomplish that?

Thanks again for the great post! I think you could publish an article on this, it would be good to have wider discussion, academic, ES6, etc.

d.

On 25 April 2018 at 17:24, Nathaniel Smith <njs@pobox.com> wrote:
Hi all,
I just posted another essay on concurrent API design:
https://vorpus.org/blog/notes-on-structured-concurrency-or-go-statement-cons...
This is the one that finally gets at the core reasons why Trio exists; I've been trying to figure out how to write it for at least a year now. I hope you like it.
(Guido: this is the one you should read :-). Or if it's too much, you can jump to the conclusion [1], and I'm happy to come find you somewhere with a whiteboard, if that'd be helpful!)
-n
[1] https://vorpus.org/blog/notes-on-structured-concurrency-or-go-statement-cons...
-- Nathaniel J. Smith -- https://vorpus.org
On Thu, Apr 26, 2018 at 7:55 PM, Dima Tisnek <dimaqq@gmail.com> wrote:
My 2c after careful reading:
restarting tasks automatically (custom nursery example) is quite questionable:

* it's unexpected
* it's not generally safe (argument reuse, side effects)
* user's coroutine can be decorated to achieve same effect
It's an example of something that a user could implement. I guess if you go to the trouble of implementing this behavior, then it is no longer unexpected and you can also cope with handling the edge cases :-). There may be some reason why it turns out to be a bad idea specifically in the context of Python, but it's one of the features that's famously helpful for making Erlang work so well, so it seemed worth mentioning.
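For reference, the "decorate the coroutine" approach mentioned above can be sketched as a small wrapper. restart_on_crash is a hypothetical helper, not part of Trio, and Dima's caveat about argument reuse and side effects still applies, since the same arguments are passed in again on every restart:

    import logging

    import trio

    async def restart_on_crash(async_fn, *args):
        while True:
            try:
                await async_fn(*args)
                return  # clean exit: don't restart
            except Exception:
                # trio.Cancelled is a BaseException, so cancellation from the
                # enclosing nursery still propagates instead of being retried.
                logging.exception("task crashed; restarting")
                await trio.sleep(1)  # brief backoff before retrying

    # Usage inside a nursery:
    #     nursery.start_soon(restart_on_crash, flaky_worker, some_arg)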
It's very nice to have the escape hatch of posting tasks to "someone else's" nursery. I feel there are more caveats to posting a task to a parent's or global nursery though. Consider that local tasks typically await on other local tasks. What happens when N1-task1 waits on N2-task2 and N2-task9 encounters an error? My guess is N2-task2 is cancelled, which by default cancels N1-task1 too, right? That kinda breaks the abstraction, doesn't it?
"Await on a task" is not a verb that Trio has. (We don't even have task objects, except in some low-level plumbing/introspection APIs.) You can do 'await queue.get()' to wait for another task to send you something, but if the other task gets cancelled then the data will just... never arrive. There is some discussion here of moving from a queue.Queue-like model to a model with separate send- and receive-channels: https://github.com/python-trio/trio/issues/497 If we do this (which I suspect we will), then probably the task that gets cancelled was holding the only reference to the send-channel (or even better, did 'with send_channel: ...'), so the channel will get closed, and then the call to get() will raise an error which it can handle or not... But yes, you do need to spend some time thinking about what kind of task tree topology makes sense for your problem. Trio can give you tools but it's not a replacement for thoughtful design :-).
If the escape hatch is available, how about allowing tasks to be moved between nurseries?
That would be possible (and in fact there's one special case internally where we do it!), but I haven't seen a good reason yet to implement it as a standard feature. If someone shows up with use cases then we could talk about it :-).
Is dependency inversion allowed? (As in, given parent N1 and child N1.N2, can N1.N2.t2 await on N1.t1?) If that's the case, I guess it's not a "tree of tasks", as in the graph is arbitrary, not a DAG.
See above re: not having "wait on a task" as a verb.
I've seen [proprietary] strict DAG task frameworks. While they are useful to e.g. perform sub-requests in parallel, they are not general enough to be useful at large. Thus I'm assuming trio does not enforce a DAG...
The task tree itself is in fact a tree, not a DAG. But that tree doesn't control which tasks can talk to each other. It's just used for exception propagation, and for enforcing that all children have to finish before the parent can continue. (Just like how in a regular function call, the caller stops while the callee is running.) Does that help?
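A tiny example of the "children finish before the parent continues" point, using only the public nursery API:

    import trio

    async def child(name, seconds):
        await trio.sleep(seconds)
        print(name, "done")

    async def parent():
        async with trio.open_nursery() as nursery:
            nursery.start_soon(child, "a", 0.1)
            nursery.start_soon(child, "b", 0.2)
        # Like a function call returning only after the callee is done,
        # this line runs only after both children have finished.
        print("parent continues")

    trio.run(parent)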
Finally, slob programmers like me occasionally want fire-and-forget tasks, aka daemonic threads. Some are long-lived, e.g. "battery status poller", others short-lived, e.g. "tail part of low-latency logging". Obv., a careful programmer would keep track of those, but we want things simple :) Perhaps in line with the batteries-included principle, trio could include a standard way to accomplish that?
Well, what semantics do you want? If the battery status poller crashes, what should happen? If the "tail part of low-latency logging" command is still running when you go to shut down, do you want to wait a bit for it to finish, or cancel it, or ...?

You can certainly implement some helper like:

    async with open_throwaway_nursery() as throwaway_nursery:
        # If this crashes, we ignore the problem, maybe log it or something
        throwaway_nursery.start_soon(some_fn)
        ...
        # When we exit the with block, it gets cancelled

... if that's what you want. Before adding anything like this to trio itself though I'd like to see some evidence of how it's being used in real-ish projects.
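One way such a helper could be written is sketched below. open_throwaway_nursery is hypothetical, not a Trio API; this just spells out the semantics described above: crashes are logged and swallowed, and anything still running is cancelled when the block exits.

    from contextlib import asynccontextmanager
    import logging

    import trio

    class ThrowawayNursery:
        def __init__(self, nursery):
            self._nursery = nursery

        def start_soon(self, async_fn, *args):
            self._nursery.start_soon(self._swallow, async_fn, args)

        async def _swallow(self, async_fn, args):
            try:
                await async_fn(*args)
            except Exception:
                # Crashes are logged and ignored; trio.Cancelled is a
                # BaseException, so cancellation still propagates.
                logging.exception("throwaway task crashed")

    @asynccontextmanager
    async def open_throwaway_nursery():
        async with trio.open_nursery() as nursery:
            try:
                yield ThrowawayNursery(nursery)
            finally:
                # Cancel whatever is still running when the block exits.
                nursery.cancel_scope.cancel()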
Thanks again for the great post! I think you could publish an article on this, it would be good to have wider discussion, academic, ES6, etc.
Thanks for the vote of confidence :-). And, we'll see...

-n

-- Nathaniel J. Smith -- https://vorpus.org
participants (5)

- Andrew Svetlov
- Antoine Pitrou
- Dima Tisnek
- Guido van Rossum
- Nathaniel Smith