Cofunctions/yield from -> fibers

The original proposal for introducing 'yield from' (which is still reflected in PEP 380's name) was to delegate part of a generator's work to another generator. In later discussions, however, the focus shifted entirely to cooperative multitasking. In the example code Greg gave in http://mail.python.org/pipermail/python-ideas/2010-August/007927.html , there is not a single use case for delegation: the 'yield from's essentially replace normal function calls. All Greg uses this machinery for is to manually switch between `threads' simulating individual customers. The _programming_logic_ is plain function calls; the yields/cocalls are just a crutch (and quite an ugly one) to release a time slot.

So, in fact, this whole `yield from / coroutine' effort is an attempt to inseparably mix two very loosely related subjects:

- code delegation (the order in which sequential, closely related code is executed)
- execution scheduling (the order in which parallel, loosely related or unrelated code is executed)

in the form "a function call _may_be_ a scheduling event at the same time". That's why it all feels so `clumsy/weird/twisted/fallacious'. Cooperative threads must be switched somehow, but choosing such a quirky and implicit technique for it is completely against the Zen of Python (it violates about half of the items :^) ). Instead, if it's cooperative multitasking you want, switching should be independent of other activity and as explicit as possible.

There's a technique just for that, called 'fibers' (MS fiber API: http://msdn.microsoft.com/en-us/library/ms682661.aspx ). In short:

- ConvertThreadToFiber() converts the current thread into the initial fiber
- CreateFiber(*function, *parameter) -> ID creates a new fiber
- SwitchToFiber(ID) switches execution to another fiber
- ExitFiber() exits a fiber; in Python, the function could simply return, as with threads

Since the goal of fibers is the same as that of threads, it is reasonable to borrow knowledge from there too, perhaps duplicating the thread interface where applicable.
And like cofunctions, these fibers are exactly what http://en.wikipedia.org/wiki/Coroutine describes. With interacting coroutines, there is no execution stack any more: there are 'calls' but no 'returns'. It's essentially like passing messages back and forth, as in win32k.

`Yield from's are still valid, but only as a code delegation technique, i.e. a shortcut for `for i in subgen(): yield i'. The latter statement looks brief enough for me to discard the proposal altogether. send() and friends don't seem to fit what generators were originally intended to do: producing sequences of arbitrary or unknown length.
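The fiber API sketched above can be mimicked in Python with plain generators. The following is a minimal sketch (all names are hypothetical, not part of any proposal): each "fiber" is a generator that yields the id of the fiber to switch to, and a trampoline loop performs the switch, loosely mirroring CreateFiber/SwitchToFiber.

```python
# Hypothetical sketch of a fiber-style trampoline built on plain generators.
# A "fiber" yields the id of the fiber it wants to switch to; the trampoline
# performs the actual switch (compare SwitchToFiber in the Win32 API above).

def run_fibers(factories, start):
    """factories: dict mapping fiber id -> generator function."""
    fibers = {fid: f() for fid, f in factories.items()}
    current = start
    trace = []                               # which fiber ran at each step
    while fibers:
        try:
            target = next(fibers[current])   # run fiber until it "switches"
        except StopIteration:                # ExitFiber: the function returned
            del fibers[current]
            if not fibers:
                break
            current = next(iter(fibers))     # resume any remaining fiber
        else:
            trace.append(current)
            current = target                 # SwitchToFiber(target)
    return trace

def ping():                 # fiber: do a step of work, then switch to "pong"
    for _ in range(2):
        yield "pong"

def pong():                 # fiber: do a step of work, then switch to "ping"
    for _ in range(2):
        yield "ping"

print(run_fibers({"ping": ping, "pong": pong}, "ping"))
# ['ping', 'pong', 'ping', 'pong']
```

Note that the switch here is fully explicit: the only yields in the fibers are the switch points, and the logic in between is ordinary sequential code.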

On 13/08/10 13:02, Ivan Pozdeev wrote:
I still regard delegation as an important use case for yield-from. I haven't dwelled much on delegation of plain value-yielding generators in my examples because it seems rather obvious and straightforward. My coroutine examples are meant to show the motivation for some of the less-obvious aspects of yield-from, such as the ability to return a value.
That seems like a strange statement to me. There is delegation going on all over the place in all my examples. Whenever a generator in one of my coroutines uses yield-from to call another one, it's delegating some of the work of that coroutine to that generator. The whole point of yield-from, and, to an even greater extent, cofunctions, is to make delegation between suspendable functions look as similar as possible to delegation between ordinary ones.
I'm confused about what you're asking for. If you're complaining that it's weird having to write 'yield from' or 'cocall' instead of a plain function call, then I actually agree with you. I'm trying to move cofunctions in the direction of eliminating such call markers completely, but I'm meeting quite a bit of resistance.

If you're saying that switching between coroutines should involve explicitly nominating which coroutine to run next, it would be straightforward to write a scheduler that works this way. In fact, I can write one right now; it would look something like this:

    def switch_to_fibre(x):
        global current_fibre
        current_fibre = x

    def main_loop():
        while 1:
            next(current_fibre)

However, it seems to me that a function such as switch_to_fibre() is mainly useful as a primitive on which to build higher-level scheduling strategies. It would be possible to reformulate all of my example schedulers to be based on such a primitive, but I'm not sure there would be any point in doing so. Generators already have their own primitive notion of "run this one next", i.e. calling next() on them, and it seems to be quite sufficient.
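Greg's sketch can be fleshed out into a small runnable demo (the task generators and the bounded driver loop below are illustrative additions, not part of his proposal): two coroutines explicitly nominate each other before suspending.

```python
# Hypothetical fleshing-out of the switch_to_fibre() sketch: two generators
# that explicitly nominate which fibre runs next, then suspend with a bare
# yield. The main loop just keeps resuming whatever is "current".
current_fibre = None
log = []

def switch_to_fibre(x):
    global current_fibre
    current_fibre = x

def task_a():
    while True:
        log.append("a")
        switch_to_fibre(fibre_b)   # explicitly nominate the next fibre
        yield                      # suspend; scheduler picks up current_fibre

def task_b():
    while True:
        log.append("b")
        switch_to_fibre(fibre_a)
        yield

fibre_a, fibre_b = task_a(), task_b()
switch_to_fibre(fibre_a)
for _ in range(4):                 # main_loop(), bounded for the demo
    next(current_fibre)

print(log)   # ['a', 'b', 'a', 'b']
```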
But it only delegates next(), not send(), throw() or close(). If those things are considered important for single-function generators to have, then they are presumably equally important for generators that are spread over more than one function. Also, yield-from is *much* more efficient than the equivalent for-loop -- less than 10% of the overhead in my current implementation. -- Greg
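The delegation gap Greg points out can be seen in a short sketch (the generator names here are hypothetical): a for-loop forwards only next(), so a value passed in with send() is silently dropped instead of reaching the inner generator, whereas yield-from forwards it.

```python
# Demonstration of the point above: for-loop delegation forwards only
# next(); a value passed in with send() never reaches the inner generator.

def inner():
    received = yield "ready"
    yield ("inner got", received)

def outer_for():
    for v in inner():       # delegates next() only; the send() value is
        yield v             # consumed at this yield and discarded

def outer_from():
    yield from inner()      # delegates send(), throw() and close() too

g = outer_for()
next(g)
print(g.send("hello"))      # ('inner got', None) -- the value was lost

g = outer_from()
next(g)
print(g.send("hello"))      # ('inner got', 'hello')
```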

On Fri, Aug 13, 2010 at 3:07 PM, geremy condra <debatem1@gmail.com> wrote:
Yes, it would be nice if PEP 380's generator delegation forest didn't get lost in the cofunction trees :) I think the cofunction discussion suggests that there are some very different possible answers as to what is the scheduler's responsibility and what is the responsibility of the individual coroutines. Picking one of them as a winner by blessing it with syntax seems rather premature. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On 8/13/2010 1:35 AM, Nick Coghlan wrote:
Yes, I think even something as trivial as an example of in-order iteration over a binary tree should be included, since it is accessible and the benefits in readability, efficiency, and correctness are apparent:

    class BinaryTree:
        def __init__(self, left=None, us=None, right=None):
            self.left = left
            self.us = us
            self.right = right

        def __iter__(self):
            if self.left:
                yield from self.left
            if self.us:
                yield self.us
            if self.right:
                yield from self.right

You can point out that "yield from" prevents a recursion depth problem while also being agnostic about whether left/right is itself a BinaryTree object (e.g., a tuple or list or some other user-defined type works just as well as an iterable leaf) -- a feat that would be rather complicated and verbose otherwise. As a bonus, the run time of such an iteration is faster due to the flattening optimization that is only possible with special syntax. -- Scott Dial scott@scottdial.com scodial@cs.indiana.edu
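A self-contained version of the class Scott describes can be exercised as follows (Python 3.3+ for "yield from"; the sample tree is illustrative). Note the leaves are plain lists: no BinaryTree-specific type checks are needed, since "yield from" only requires an iterable.

```python
# Self-contained demo of the in-order binary tree iteration sketched above.
class BinaryTree:
    def __init__(self, left=None, us=None, right=None):
        self.left = left
        self.us = us
        self.right = right

    def __iter__(self):
        if self.left:
            yield from self.left    # works for subtrees AND plain iterables
        if self.us:
            yield self.us
        if self.right:
            yield from self.right

# Mixed tree: BinaryTree subtrees with list leaves.
tree = BinaryTree(BinaryTree([1], 2), 3, BinaryTree(None, 4, [5, 6]))
print(list(tree))   # in-order traversal: [1, 2, 3, 4, 5, 6]
```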

On 8/13/2010 4:11 AM, Greg Ewing wrote:
That's not actually true -- a for-loop would work with any iterable node object just as well.
I agree that it works, but it does not avoid the recursion-depth problem, and you pay for the descent through all of the for-loops. I assume we are both talking about an __iter__() implementation like:

    def __iter__(self):
        if self.left:
            for v in self.left:
                yield v
        if self.us:
            yield self.us
        if self.right:
            for v in self.right:
                yield v

But this fails with a recursion-depth RuntimeError around a depth of ~1000 or so. Perhaps that is not an interesting practical problem. But my comment about it being "complicated and verbose" was that to avoid that depth problem, the obvious solution is to make __iter__() implement an in-order stack itself and manufacture some way to deal with other types being in the tree. But again, given how deep a structure you need, perhaps it's not that interesting. -- Scott Dial scott@scottdial.com scodial@cs.indiana.edu
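The depth problem Scott describes can be reproduced with a deliberately deep chain of delegating generators (the chain builder below is a hypothetical stand-in for a degenerate, list-like tree): every next() on the outer generator re-enters each delegating frame, so a chain deeper than the recursion limit fails at iteration time.

```python
# Sketch of the recursion-depth problem with for-loop delegation: each
# next() call descends through every nested generator frame in the chain.
import sys

def chain(n):
    if n == 0:
        yield "leaf"
    else:
        for v in chain(n - 1):   # plain for-loop delegation, n levels deep
            yield v

print(next(chain(50)))           # shallow chains are fine: 'leaf'

deep = chain(sys.getrecursionlimit() + 100)
try:
    next(deep)                   # descends through every level at once
except RecursionError:           # raised as RuntimeError before Python 3.5
    print("hit the recursion limit, as described above")
```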

On Fri, Aug 13, 2010 at 5:54 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
A PEP 342 based scheduler requires coroutines that implement the generator API (i.e. ones that support send/throw/close) but you're claiming that it is acceptable to refer to an ordinary iterator as a coroutine and use other channels (such as module globals) to communicate requests to the scheduler. That's the only point I'm objecting to - I want to see the ability to receive values and exceptions from outside at suspension points as part of any defined coroutine API. Part of my goal in that is to emphasise that coroutines are *not* used in the same way as iterators, and hence have a wider API. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

Nick Coghlan wrote:
No, I don't use module globals to communicate requests to the scheduler; I use function calls or coroutine calls. For example, where a PEP 342 coroutine would do something like

    yield WaitForSocket(sock)

I would instead do

    yield from wait_for_socket(sock)

or

    cocall wait_for_socket(sock)

The implementation of wait_for_socket() may make use of module globals internal to the scheduler to keep track of which coroutines are waiting for which sockets, but that's a detail private to the scheduler.
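One way the pattern Greg describes could look is sketched below (this is a guess at the shape of his scheduler, not his actual code; wait_for_socket() and run() are hypothetical): the coroutine writes `yield from wait_for_socket(sock)`, and the table of who is waiting on what stays private to the scheduler module.

```python
# Hypothetical sketch: a scheduler whose waiting table is a module-private
# global, accessed only through the wait_for_socket() coroutine call.
import select
import socket

_waiting = {}                 # socket -> coroutine (private to the scheduler)

def wait_for_socket(sock):
    yield ("wait", sock)      # request object interpreted only by run()

def run(coros):
    ready = list(coros)
    while ready or _waiting:
        while ready:
            coro = ready.pop(0)
            try:
                op, sock = next(coro)
            except StopIteration:
                continue      # coroutine finished
            if op == "wait":
                _waiting[sock] = coro
        if _waiting:
            readable, _, _ = select.select(list(_waiting), [], [])
            for sock in readable:
                ready.append(_waiting.pop(sock))

# Usage: one coroutine waiting for data on a socketpair.
a, b = socket.socketpair()
received = []

def reader():
    yield from wait_for_socket(b)
    received.append(b.recv(100))

a.send(b"hello")              # data is already pending when run() starts
run([reader()])
print(received)               # [b'hello']
a.close()
b.close()
```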
The *ability* is there by virtue of the fact that they *can* implement those methods if they want. You seem to be going further and insisting that they *must* implement those methods, even if the implementations are empty or trivial. I still can't see how that's a good thing.
Part of my goal in that is to emphasise that coroutines are *not* used in the same way as iterators,
If you really wanted to do that, you would have to give them a different API altogether, such as using resume() instead of next(). -- Greg

On Sat, Aug 14, 2010 at 11:54 AM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
By scheduler, I mean the actual dispatch loop, not just any code that happens to live in the same module or package.
Scheduler authors shouldn't have to pepper their code with conditional checks for send/throw/close support on the coroutines. By allowing things that only implement the iterator API, without those three methods, to be called "coroutines", you're invalidating that assumption, making schedulers that rely on it technically incorrect. If a scheduler chooses *not* to rely on PEP 342, that's fine. But with PEP 342 in place, we should respect its definition of the expected coroutine API.
The coroutine API is a superset of the iterator API, but it's still different. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

Nick Coghlan wrote:
Scheduler authors shouldn't have to pepper their code with conditional checks for send/throw/close support on the coroutines.
Okay, I see what you're getting at now. We've been talking about slightly different things. I've been talking about the semantics of cocall, which, if defined in terms of yield-from without further qualification, inherits its fallback behaviour in the absence of a full set of generator methods. However, you're talking about the agreed-upon interface between a scheduler and the objects that it schedules. That's a matter for the scheduler author to decide upon and document. It may well be that some schedulers will require some or all of the generator methods to be implemented, but I expect there to be a substantial class that don't. The wording in the PEP is only meant to specify the language-mandated requirements for an object to be used with cocall. If both yield-from and cocall are to exist, it's simplest to use the same semantics for both. Making the language definition any more complicated would require a fairly strong justification, and I'm not convinced that this qualifies. For the convenience of schedulers, functions could be provided for sending, throwing and closing that provide the same fallback behaviour as yield-from and cocall. -- Greg
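Such convenience functions might look like the sketch below (the names co_send/co_throw/co_close are hypothetical; no such functions were actually specified). Each uses the generator method when the object provides it and otherwise falls back to plain-iterator behaviour, roughly mirroring what yield-from does.

```python
# Hypothetical sketch of the fallback helpers Greg mentions: use the
# generator method if present, otherwise fall back to iterator behaviour.

def co_send(it, value):
    send = getattr(it, "send", None)
    if send is not None:
        return send(value)
    if value is not None:            # only None can go to a plain iterator
        raise AttributeError("iterator has no send() method")
    return next(it)

def co_throw(it, exc):
    throw = getattr(it, "throw", None)
    if throw is not None:
        return throw(exc)
    raise exc                        # fall back to raising at the call site

def co_close(it):
    close = getattr(it, "close", None)
    if close is not None:
        close()                      # plain iterators are simply dropped

def gen():
    x = yield 1
    yield x

g = gen()
print(co_send(g, None))           # 1  (first resumption, like next())
print(co_send(g, 42))             # 42 (forwarded via send())
print(co_send(iter([7]), None))   # 7  (plain iterator: falls back to next())
```

With helpers like these, a scheduler can treat generator-based coroutines and plain iterators uniformly without peppering its dispatch loop with hasattr() checks.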

On Mon, Aug 16, 2010 at 8:46 AM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Ah, OK. To avoid anyone else being tempted to overspecify, perhaps a comment in parentheses to point out that some coroutine schedulers may require a broader coroutine API, such as that of PEP 342? I don't think this is really common enough to justify yet more functions. Although, if you added a coroutines module, you could put costart and those helper functions in there rather than making them builtins. Such a module would also make sense for the decorator/function based alternative proposal. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

participants (5)
- geremy condra
- Greg Ewing
- Ivan Pozdeev
- Nick Coghlan
- Scott Dial