Minimal 'stackless' PEP using generators?
I just read the thread 'Stackless Python' in June 2004 on python-dev and was wondering if you'd comment on a simpler cooperative mechanism, via a small hack to generators:

1. The PEP would introduce a new 'builtin' class called 'Cooperate'

2. Generator semantics would be altered so that 'yield X', where X is an instance of Cooperate, would automagically propagate to the outer-most non-generator.

For example:

    >>> class MyCooperate(magic.Cooperate):
    >>>     def __str__(self):
    >>>         return "cooperate"
    >>>
    >>> def top():
    >>>     yield "one"
    >>>     yield MyCooperate()
    >>>     yield "two"
    >>>
    >>> def middle():
    >>>     """ intermediate generator _only_ sees one and two """
    >>>     for x in top():
    >>>         print "middle", x
    >>>         yield x
    >>>
    >>> def lower():
    >>>     """ this is not a generator, so it sees cooperate """
    >>>     for x in middle():
    >>>         print "lower", x
    >>>
    >>> lower()
    middle one
    lower one
    lower cooperate
    middle two
    lower two

With these two changes, the "lower" function could be an async reactor like those found in Twisted, or asyncore. While the result isn't true coroutines, it would be a huge improvement for those who would like to do async coding. I've done something similar with Twisted called Flow [1] and it works well, with the exception of being a painful syntax hack and being quite slow. If this were moved into Python's core, we'd get most of the advantages of coroutines without the full cost.

Thoughts?

Clark

[1] http://twistedmatrix.com/documents/current/howto/flow

--
Clark C. Evans                    Prometheus Research, LLC.
http://www.prometheusresearch.com/
office: +1.203.777.2550           mobile: +1.203.444.0557
Prometheus Research: Transforming Data Into Knowledge
  - Research Exchange Database
  - Survey & Assessment Technologies
  - Software Tools for Researchers
At 11:39 AM 8/23/04 -0400, Clark C. Evans wrote:
I just read the thread 'Stackless Python' in June 2004 on python-dev and was wondering if you'd comment on a simpler cooperative mechanism, via a small hack to generators:
1. The PEP would introduce a new 'builtin' class called 'Cooperate'
2. Generator semantics would be altered so that 'yield X', where X is an instance of Cooperate, would automagically propagate to the outer-most non-generator.
Perhaps you mean "inner-most"?
With these two changes, the "lower" function could be an async reactor like those found in Twisted, or asyncore. While the result isn't true coroutines, it would be a huge improvement for those who would like to do async coding. I've done something similar with Twisted called Flow [1] and it works well, with the exception of being a painful syntax hack and being quite slow. If this were moved into Python's core, we'd get most of the advantages of coroutines without the full cost.
Thoughts?
It doesn't seem to me to actually help anything. You can already do this using a simple wrapper object that maintains a stack of active generators, as I do in 'peak.events'. I was hoping that you had actually come up with a solution for the more complex problem of suspending *non* generator functions, in a way that would work with CPython. :(
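As a rough illustration of what such a wrapper can look like, here is a minimal, hypothetical sketch (invented names, not the actual peak.events code, and spelled with the modern `next()` builtin): a "trampoline" keeps its own stack of generators, so an intermediate generator simply yields a sub-generator and stays oblivious to control values, which the driver consumes directly.

```python
from types import GeneratorType

class Cooperate(object):
    """Control value asking the outermost driver to yield control."""

def drive(gen):
    """Trampoline: run nested generators off an explicit stack, letting
    Cooperate instances skip past intermediate frames untouched."""
    stack = [gen]
    results = []
    while stack:
        try:
            value = next(stack[-1])
        except StopIteration:
            stack.pop()          # sub-generator finished; resume its parent
            continue
        if isinstance(value, Cooperate):
            results.append('cooperate')   # a reactor would switch tasks here
        elif isinstance(value, GeneratorType):
            stack.append(value)           # descend into the sub-generator
        else:
            results.append(value)         # an ordinary produced value
    return results

def top():
    yield "one"
    yield Cooperate()
    yield "two"

def middle():
    # oblivious to Cooperate: just delegates to top()
    yield top()

drive(middle())   # -> ['one', 'cooperate', 'two']
```

Note that middle() never sees the Cooperate instance at all: the driver consumes it directly, which is the behavior Clark's proposal asks the core to provide automatically.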
On Mon, Aug 23, 2004 at 11:56:04AM -0400, Phillip J. Eby wrote:
| > 1. The PEP would introduce a new 'builtin' class called 'Cooperate'
| >
| > 2. Generator semantics would be altered so that 'yield X', where X
| >    is an instance of Cooperate, would automagically propagate to
| >    the outer-most non-generator.
|
| Perhaps you mean "inner-most"?

Yes. The top-most non-generator on the stack.

| It doesn't seem to me to actually help anything. You can already do
| this using a simple wrapper object that maintains a stack of active
| generators, as I do in 'peak.events'.

Could you provide an example? The problem this proposal solves is straightforward -- it is tedious and slow to have intermediate generators do stuff like:

      def middle():
          """ intermediate generator _only_ sees one and two """
          for x in top():
    !         if isinstance(x, X):
    !             yield x
    !             continue
              print "middle", x
              yield x

This extra step is tedious and also slow, especially if one has lots of yield statements that cooperate. It could be standardized and made a bit snappier if it were built-in behavior. This is an 80/5 proposal: one gets 80% of the happiness, with 5% of the pain.

| I was hoping that you had actually come up with a solution for the
| more complex problem of suspending *non* generator functions, in a way
| that would work with CPython. :(

Yes, I know. I'm trying to avoid this much harder problem.

Clark
At 12:34 PM 8/23/04 -0400, Clark C. Evans wrote:
On Mon, Aug 23, 2004 at 11:56:04AM -0400, Clark C. Evans wrote:
| It doesn't seem to me to actually help anything. You can already do
| this using a simple wrapper object that maintains a stack of active
| generators, as I do in 'peak.events'.
Could you provide an example? The problem this proposal solves is straightforward -- it is tedious and slow to have intermediate generators do stuff like:
      def middle():
          """ intermediate generator _only_ sees one and two """
          for x in top():
    !         if isinstance(x, X):
    !             yield x
              print "middle", x
              yield x
This extra step is tedious and also slow, especially if one has lots of yield statements that cooperate.
'peak.events' uses "Task" objects that maintain a stack of active generators. The Task receives yields from the "innermost" generator directly, without them being passed through by intermediate generators. If the value yielded is *not* a control value, the Task object pops the generator stack, and resumes the previously suspended generator. A "magic" function, 'events.resume()', retrieves the value from the Task inside the stopped generator.

Basically, this mechanism doesn't pass control values through multiple tests and generator frames: control values are consumed immediately by the Task. This makes it easy to suspend nested generators while waiting for some event, such as socket readability, a timeout, a Twisted "Deferred", etc. Yielding an "event" object like one of the aforementioned items causes the Task to return to its caller (the event loop) after requesting a callback for the appropriate event. When the callback re-invokes the thread, it saves the value associated with the event, if any, for 'events.resume()' to retrieve when the topmost generator is resumed.

Also, 'events.resume()' supports passing errors from one generator to the next, so that it's "as if" the generators execute in a nested fashion. The drawback is that you must invoke events.resume() after each yield, but this is *much* less intrusive than requiring generators to pass through results from all nested generators.

Take a look at:

    http://cvs.eby-sarna.com/PEAK/src/peak/events/

In particular, the 'interfaces' and 'event_threads' modules. Here's a usage example, a simple Task procedure:

    @events.taskFactory
    def monitorBusy(self):
        # get a "readable" event on this socket
        untilReadable = self.eventLoop.readable(self)

        while True:
            # Wait until we have stream activity
            yield untilReadable; events.resume()

            # Is everybody busy?
            if self.busyCount() == self.childCount():
                self.supervisor.requestStart()

            # Wait until the child or busy count changes before proceeding
            yield events.AnyOf(self.busyCount, self.childCount); events.resume()

This task waits until a listener socket is readable (i.e. an incoming connection is pending), and then asks the process supervisor to start more processes if all the child processes are busy. It then waits until either the busy count or the child process count changes, before it waits for another incoming connection.

Basically, if you're invoking a sub-generator, you do:

    yield subGenerator(arguments); result = events.resume()

This is if you're calling a sub-generator that only returns one "real" result. You needn't worry about passing through control values, because the current generator won't be resumed until the subgenerator yields a non-control value.

If you're invoking a sub-generator that you intend to *iterate over*, however, and that generator can suspend on events, it's a bit more complex:

    iterator = subGenerator(arguments)
    while True:
        yield iterator; nextItem = events.resume()
        if nextItem is NOT_GIVEN:   # sentinel value
            break
        # body of loop goes here, using 'nextItem'

This is not very convenient, but I don't find it all that common to have data I'm iterating over in such a fashion, because 'peak.events' programs tend to have "infinite" streams that are organized as event sources in "pipe and filter" fashion. So, you tend to end up with Tasks that only have one generator running anyway, except for things that are more like "subroutines" than real generators, because you only expect one real return value from them, anyway.

peak.events can work with Twisted, by the way, if you have it installed. For example, this:

    yield aDeferred; result = events.resume()

suspends the generator until the Deferred fires, and then the result will be placed in 'result' upon resumption of the generator.
If the Deferred triggers an "errback", the call to 'events.resume()' will reraise the error, inside the current generator.

It would be nice if there were some way to "accept" data and exceptions within a generator that didn't require the 'events.resume' hack, e.g.:

    result = yield aDeferred

would be really nice, especially if 'result' could cause an exception to be raised. I was hoping that this was something along the lines of what you were proposing. E.g. if generator-iterators could take arguments to 'next()' that would let you do this. I believe there's already a rejected PEP covering the issue of communicating "into" generators.

Perhaps there should be a "simple coroutines" PEP, that doesn't try to extend generators into coroutines, but instead treats coroutines as a first-class animal that just happens to be implemented using some of the same techniques "under the hood".
On Mon, Aug 23, 2004 at 01:18:28PM -0400, Phillip J. Eby wrote:
| It would be nice if there were some way to "accept" data and exceptions
| within a generator that didn't require the 'events.resume' hack, e.g.:
|
|     result = yield aDeferred
|
| would be really nice, especially if 'result' could cause an exception
| to be raised. I was hoping that this was something along the lines of
| what you were proposing.

Perhaps it would be nice to add an alternative syntax to call a generator when you are expecting exactly one value:

    def generator():
        yield 'one value'

    def consumer():
        value = generator()

This, when combined with the previous proposal, would give:

    >>> def top():
    >>>     yield cooperate
    >>>     yield "one value"
    >>>
    >>> def middle():
    >>>     """ intermediate generator _only_ sees 'one value' """
    >>>     bing = top()
    >>>     # do something with bing
    >>>     yield bing
    >>>
    >>> def lower():
    >>>     """ this is not a generator, so it sees cooperate """
    >>>     for x in middle():
    >>>         print x
    >>>
    >>> lower()
    cooperate
    one value

| Perhaps there should be a "simple coroutines" PEP, that doesn't try to
| extend generators into coroutines, but instead treats coroutines as a
| first-class animal that just happens to be implemented using some of
| the same techniques "under the hood".

The problem is that maintaining a 'stack' of generators requires that intermediate generators in the stack perform a tedious hack (in both peak.events and twisted.flow) for them to work. If we can have Python allow messages to be sent from the top-most generator to the inner-most non-generator (via yield cooperate, or some other magic), this difficulty is resolved -- intermediate generators can then be written in an operational style, without icky hacks for managing deferred execution.

Full-blown coroutines aren't necessary. A small tweak to generators will do.

Best,

Clark
At 02:33 PM 8/23/04 -0400, Clark C. Evans wrote:
Perhaps it would be nice to add an alternative syntax to call a generator when you are expecting exactly one value.
    def generator():
        yield 'one value'

    def consumer():
        value = generator()
We have that today:

    value, = generator()

Or do I misunderstand you?
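The spelling works because a generator is iterable, and single-name tuple unpacking both drains the iterator and asserts that it yields exactly one value. A quick demonstration:

```python
def generator():
    yield 'one value'

# Unpacking into a one-element target consumes the generator and
# checks that it produced exactly one value.
value, = generator()
assert value == 'one value'

# Too many (or too few) values raises ValueError, so mistakes fail loudly:
def two():
    yield 1
    yield 2

try:
    v, = two()
except ValueError:
    pass  # "too many values to unpack"
```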
Full-blown coroutines aren't necessary. A small tweak to generators will do.
I don't think this is true. Your hypothetical example can't resume 'top()' after it yields the "co-operate" control value, unless either it *has* a stack of generators, or the Python core somehow maintains a stack of the executing generators. So, my point was that since this can already be done in user-level code with a stack of generators, I don't see the point to adding a facility to Python to create hidden stacks of generators. Instead, it would be more useful to get rid of the one piece that really is "magic", by providing a way to pass values or exceptions into a running generator. My comment about "coroutines" was more that Guido previously expressed distaste for adding such a communication mechanism to generators as abusing the concept of a generator just being a way to implement complex iterators. Therefore, I thought a coroutine proposal (backed by a suitable syntax and an implementation plan) might have more success.
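For historical context: the facility Phillip asks for here -- passing values and exceptions into a running generator -- is essentially what later landed in PEP 342 as the generator methods send() and throw() (Python 2.5). A small illustration using that later API, not anything available at the time of this thread:

```python
def averager():
    # A generator that accepts values pushed *into* it.
    total, count = 0.0, 0
    average = None
    while True:
        value = yield average   # receives whatever the caller send()s
        total += value
        count += 1
        average = total / count

g = averager()
next(g)                  # prime: advance to the first yield
assert g.send(10) == 10.0
assert g.send(30) == 20.0

# Exceptions can be raised *inside* the generator at its yield point, too:
try:
    g.throw(ValueError("bad sample"))
except ValueError:
    pass                 # not caught inside the generator, so it propagates
```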
On Mon, Aug 23, 2004 at 02:59:18PM -0400, Phillip J. Eby wrote:
| > def generator():
| >     yield 'one value'
|
| value, = generator()

Oh, cool. ;)

| > Full-blown coroutines aren't necessary. A small tweak to
| > generators will do.
|
| I don't think this is true. Your hypothetical example can't resume
| 'top()' after it yields the "co-operate" control value, unless either
| it *has* a stack of generators, or the Python core somehow maintains
| a stack of the executing generators.

Yes. The idea is for Python to maintain a stack of generators, and whenever the top-most generator yields <magic>, this yield skips the intermediate generators on the stack and goes immediately to the calling function.

| So, my point was that since this can already be done in user-level
| code with a stack of generators, I don't see the point to adding a
| facility to Python to create hidden stacks of generators.

The problem is, if you maintain the stack of generators in userland, then each generator on the stack must do special stuff to handle 'cooperate' (in one form or another). You demonstrated an equivalent problem with your peak.events code presented earlier. Whereas, if this is built into the core, intermediate generators can be happily oblivious to the cooperative magic.

| Instead, it would be more useful to get rid of the one piece that
| really is "magic", by providing a way to pass values or exceptions
| into a running generator.

I don't like this either.

Clark
At 03:18 PM 8/23/04 -0400, Clark C. Evans wrote:
On Mon, Aug 23, 2004 at 02:59:18PM -0400, Clark C. Evans wrote:
| > def generator():
| >     yield 'one value'
|
| value, = generator()

Oh, cool. ;)
| > Full-blown coroutines aren't necessary. A small tweak to
| > generators will do.
|
| I don't think this is true. Your hypothetical example can't resume
| 'top()' after it yields the "co-operate" control value, unless either
| it *has* a stack of generators, or the Python core somehow maintains
| a stack of the executing generators.
Yes. The idea is for Python to maintain a stack of generators, and whenever the top-most generator yields <magic>, this yield skips the intermediate generators on the stack and goes immediately to the calling function.
That's not the part I mean. I'm talking about *after* that happens, when you *resume* the generators, and you'll need to reconstruct the stack as it stood when the generators were interrupted. (Your original proposal doesn't actually mention this part at all, but I assume it's necessary. After all, if all you needed was a way to stop a bunch of generators in their tracks, raising an exception would suffice.)
| So, my point was that since this can already be done in user-level
| code with a stack of generators, I don't see the point to adding a
| facility to Python to create hidden stacks of generators.
The problem is, if you maintain the stack of generators in userland, then each generator on the stack must do special stuff to handle 'cooperate' (in one form or another). You demonstrated an equivalent problem with your peak.events code presented earlier. Whereas, if this is built into the core, intermediate generators can be happily oblivious to the cooperative magic.
Though the problems are "equivalent" in complexity, the difficulty of their respective solutions is not at all comparable, because your approach either requires a technique for dynamically reconstructing the stack, or some similar mechanism for generators to keep track of the state of *other* generators when the containing generator is interrupted.
My comment about "coroutines" was more that Guido previously expressed distaste for adding such a communication mechanism to generators as abusing the concept of a generator just being a way to implement complex iterators.
This makes me think that maybe we want another kind of object, similar to a generator, but designed to be used for side effects rather than to produce values.

For want of a better term, let's call it a "cooperator" (so it ends in "ator" like "generator" :-). Actually there will be two objects, a cooperator-function (analogous to a generator-function) and a cooperator-instance (analogous to a generator-iterator).

Calling a cooperator-function will return a cooperator-instance, which will obey the cooperator protocol. This protocol consists of a single function run(), which causes the cooperator-instance to perform some amount of work and then stop. If there is more work remaining to be done, run() returns a true value, otherwise it returns a false value.

There will be a new statement:

    suspend

for use inside a cooperator-function. The presence of this statement is what distinguishes a cooperator-function from an ordinary function (just as the presence of 'yield' distinguishes a generator from an ordinary function). Execution of 'suspend' will cause the run() method to return with a true value. The next time run() is called, execution will resume after the 'suspend'.

This is really all that's needed, but one further piece of syntax may be desirable to make it easier for one cooperator to invoke another:

    do another_coop()

would be equivalent to

    co = another_coop()
    while co.run():
        suspend

(Because it implies a 'suspend', the presence of 'do' would also mark a function as a cooperator-function.)

Something to note is that there's no requirement for a cooperator-instance to be implemented by a cooperator-function, just as an iterator can be implemented in ways other than a generator. This sidesteps some of the difficulties of mixing Python and C calls, since a Python cooperator-function could invoke a cooperator implemented in C which in turn invokes another Python cooperator-function, etc. The only requirement is that all the objects on the way down obey the cooperator protocol.

The drawback is that you always have to know when you're calling another cooperator and use 'do'. But that's no worse than the current situation with generators, where you always have to know you're using an iterator and use 'for ... yield'.

What think ye all?

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz         +--------------------------------------+
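The proposed 'suspend' and 'do' statements are hypothetical syntax, but the run() protocol itself can be sketched today with ordinary objects. In this sketch (all names invented for illustration), run() performs one slice of work and returns a true value while work remains, and a Do wrapper plays the role of the proposed 'do' statement:

```python
class Countdown(object):
    """A hand-written cooperator-instance: each run() call performs one
    unit of work and returns True while work remains."""
    def __init__(self, n, log):
        self.n = n
        self.log = log
    def run(self):
        self.log.append(self.n)
        self.n -= 1
        return self.n > 0        # True means "suspended, call run() again"

class Do(object):
    """Equivalent of the proposed 'do another_coop()': keep running a
    sub-cooperator, suspending after each of its slices, until done."""
    def __init__(self, sub):
        self.sub = sub
    def run(self):
        return self.sub.run()

def drive(coop):
    """A trivial scheduler: run one cooperator to completion, counting
    how many times it suspended along the way."""
    suspensions = 0
    while coop.run():
        suspensions += 1
    return suspensions

log = []
drive(Do(Countdown(3, log)))
# log is now [3, 2, 1], accumulated one run() slice at a time
```

Countdown here is not built from a cooperator-function at all, which underscores Greg's point that a cooperator-instance only needs to obey the run() protocol, however it is implemented.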
At 05:57 PM 8/24/04 +1200, Greg Ewing wrote:
What think ye all?
I don't think it's in the right direction to achieve the effects desired by Clark or me. Here's a metaphor for what would be ideal: resumable exceptions. If you could throw a special kind of exception (or even a regular exception), and call traceback.resume() to pick up execution at the point where it was thrown, whether thrown by a generator or a regular function.

If you had that capability, you could implement any sort of coroutine or task-switching facilities you wanted. Without such a facility, one needs a stack of *ators to emulate it, to provide the effects desired by Clark's Flow or by peak.events.

However, co-operators' "suspend" doesn't provide a way to communicate between co-operators. At least 'yield' lets you yield a value. What's really needed (IMO) is to add a way to communicate *into* a co-operator, passing it a value to "accept" or a traceback to raise. E.g.:

    input = suspend output

Where 'output' is returned from the current 'run()', and 'input' is a value passed to the next 'run()'. Or, if there's a type/value/traceback passed to a 'throw()' method, then the 'suspend' statement should raise an error.

With that facility, 'peak.events' could drop the 'events.resume()' magic function. It'd still need a stack of co-operators, and there'd still be ugliness when iterating over a generator that needed to be suspendable. But at least it would be able to have clean syntax. (Though I don't think that 'input = suspend output' is actually clean syntax, it's just better than the yield/resume thing.)

Anyway, as I said, what would be *most* useful for async programming is a way to resume a traceback, because then you wouldn't need for every intervening frame to have special suspension capability.
"Phillip J. Eby" wrote:
If you could throw a special kind of exception (or even a regular exception), and call traceback.resume() to pick up execution at the point where it was thrown, whether thrown by a generator or a regular function.
Actually, it probably wouldn't be too hard to make exceptions resumable -- provided all the frames in between are Python. If the exception gets thrown through any C calls, though, resumption becomes impossible. Some other structure is needed to hold the state of a resumable C call. I agree that maintaining a stack of *ators automatically somehow would be desirable, but I haven't figured out yet exactly where and how that stack would be maintained.
What's really needed (IMO) is to add a way to communicate *into* a co-operator, passing it a value to "accept" or a traceback to raise. E.g.:
input = suspend output
There have been many suggestions in the past for 'yield' to be extended to allow values to be passed in as well as out. They all suffer from a problem of asymmetry. However you arrange it, you always end up having to discard the first value passed out or pass a dummy value in the first time, or something like that. I deliberately left communication out of the suspend to avoid those problems. If you want communication, you have to arrange it some other way.
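The asymmetry Greg describes is exactly what surfaced when generators later gained send() in PEP 342: the first resumption cannot deliver a value, because the generator has not yet reached a yield, so the first call must pass a dummy None. A brief demonstration with that later API:

```python
def echo():
    received = None
    while True:
        # One suspension point carries a value both out and in.
        received = yield received

g = echo()

# The asymmetry: the generator must reach its first 'yield' before any
# value can be accepted, so the first send() is forced to pass None,
# and the first value yielded out has no corresponding input.
first_out = g.send(None)        # priming call
assert first_out is None
assert g.send('hello') == 'hello'
```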
Anyway, as I said, what would be *most* useful for async programming is a way to resume a traceback, because then you wouldn't need for every intervening frame to have special suspension capability.
Certainly, but that way, ultimately, lies Stackless...

Greg
At 02:51 PM 8/25/04 +1200, Greg Ewing wrote:
"Phillip J. Eby" wrote:
If you could throw a special kind of exception (or even a regular exception), and call traceback.resume() to pick up execution at the point where it was thrown, whether thrown by a generator or a regular function.
Actually, it probably wouldn't be too hard to make exceptions resumable -- provided all the frames in between are Python. If the exception gets thrown through any C calls, though, resumption becomes impossible. Some other structure is needed to hold the state of a resumable C call.
Unfortunately, as was discussed on the previous Stackless thread, nearly *all* Python bytecodes pass through C on their way to other Python code. The trick is to be able to figure out whether the return value of a given frame is being put onto the value stack of its parent frame.

In principle, you could tell this from the operation being performed at the current opcode pointer in the parent frame, and the prior contents of the value stack. In practice, the eval loop sometimes messes with the value stack prior to the call (e.g. to optimize method calls), and in the case of e.g. try/finally and try/except, it's going to execute other stuff in the frame before the exception is passed up to the next frame.

So, to make it work, the interpreter loop would need to be particular about putting the stack back to the way it was before the current instruction began, when an exception occurs. Also, it would need to *not* invoke except/finally clauses for one of these special exceptions, or else have a way of restoring the frame to its pre-exception state, which seems arbitrarily complex. I guess it would have to treat the resumable exception as if a YIELD operation took place.

I'm going to have to think about this one some more, but it seems to me that it would be much easier to add bidirectional communication to generators to create a coroutine type, than to try to implement resumable exceptions.
I'm not tied to my proposal; I'm suggesting that if the scope of continuations is limited to only work with iterators, a solution may emerge. This limitation, in practice with generator syntax, is not all that bothersome to an application developer.

On Wed, Aug 25, 2004 at 12:13:46PM -0400, Phillip J. Eby wrote:
| At 02:51 PM 8/25/04 +1200, Greg Ewing wrote:
| >> If you could throw a special kind of exception (or even a regular
| >> exception), and call traceback.resume() to pick up execution at the
| >> point where it was thrown, whether thrown by a generator or a
| >> regular function.
| >
| > Actually, it probably wouldn't be too hard to make exceptions
| > resumable -- provided all the frames in between are Python. If the
| > exception gets thrown through any C calls, though, resumption
| > becomes impossible. Some other structure is needed to hold the state
| > of a resumable C call.
|
| Unfortunately, as was discussed on the previous Stackless thread,
| nearly *all* Python bytecodes pass through C on their way to other
| Python code.

However, the C code in the distribution could be scrubbed to make sure that it works with the mechanism. And the C API could be detailed so that this case is manageable; else, it is an "uncaught" exception.

| I'm going to have to think about this one some more, but it seems to
| me that it would be much easier to add bidirectional communication to
| generators to create a coroutine type, than to try to implement
| resumable exceptions.

Sure; but resumable exceptions would only need to operate with generators, and there are some simple rewrite tricks one could do. If an iterator is calling a generator, it could be re-written in a form that maintains state and _doesn't_ catch the exception. Already for generators 'finally' blocks are troublesome, so this doesn't change things for the worse. StopIteration has many of these same issues.

On Wed, Aug 25, 2004 at 03:32:28PM +1200, Greg Ewing wrote:
| > You'd definitely want to mix them. For example, a 'cooperator'
| > would be reading from a socket; it could do one of three things:
| >
| >     yield the next line from the sender
| >     raise StopIteration if the socket closed
| >     raise SuspendIteration if the read() is blocked
|
| Hmmm. It seems that generation and cooperation are really orthogonal
| concepts -- you may want something that is both a generator *and* a
| cooperator. My proposal doesn't allow for this. I will have to do
| some more thinking...

Not only orthogonal, but quite complementary.

Kind Regards,

Clark
At 01:16 PM 8/25/04 -0400, Clark C. Evans wrote:
On Wed, Aug 25, 2004 at 12:13:46PM -0400, Phillip J. Eby wrote:
| At 02:51 PM 8/25/04 +1200, Greg Ewing wrote:
| >> If you could throw a special kind of exception (or even a regular
| >> exception), and call traceback.resume() to pick up execution at the
| >> point where it was thrown, whether thrown by a generator or a
| >> regular function.
| >
| > Actually, it probably wouldn't be too hard to make exceptions
| > resumable -- provided all the frames in between are Python. If the
| > exception gets thrown through any C calls, though, resumption
| > becomes impossible. Some other structure is needed to hold the state
| > of a resumable C call.
|
| Unfortunately, as was discussed on the previous Stackless thread,
| nearly *all* Python bytecodes pass through C on their way to other
| Python code.
However, the C code in the distribution could be scrubbed to make sure that it works with the mechanism. And the C API could be detailed so that this case is manageable; else, it is an "uncaught" exception.
Quick question: does Stackless' "greenlet" extension module do what you need? I just built it and played around with it a bit. One of the demos is a pseudo-generator implementation using a 'Yield' function that uses greenlet task switching to simulate the behavior of a normal generator function. It seems to me that your "yield Cooperate" scenarios could perhaps be met by simply switching to a reactor/mainloop greenlet, when it's necessary to wait for an event, and by having your event callback switch back to the task's greenlet, returning a result if needed. Interestingly, this greenlet thing looks like it would be perfect for peak.events, allowing me to get rid of all the generators and yield/resume stuff. (In fairness, Bob Ippolito told me that before I ever implemented peak.events, but at the time greenlets didn't exist as an extension module that could be built outside of Stackless.)
On Tue, 24 Aug 2004 17:57:18 +1200, Greg Ewing wrote:
My comment about "coroutines" was more that Guido previously expressed distaste for adding such a communication mechanism to generators as abusing the concept of a generator just being a way to implement complex iterators.
This makes me think that maybe we want another kind of object, similar to a generator, but designed to be used for side effects rather than to produce values.
For want of a better term, let's call it a "cooperator" (so it ends in "ator" like "generator" :-). Actually there will be two objects, a cooperator-function (analogous to a generator-function) and a cooperator-instance (analogous to a generator-iterator).
Calling a cooperator-function will return a cooperator-instance, which will obey the cooperator protocol. This protocol consists of a single function run(), which causes the cooperator-instance to perform some amount of work and then stop. If there is more work remaining to be done, run() returns a true value, otherwise it returns a false value.
Tell me if I'm missing something here, but with the exception of your two introductory keywords (a subject I thought was supposed to be sacred, with the intention of keeping the number of keywords as small as possible), how much difference is there from generators? Are you saying the run() method, after having been called initially, will return a value indicating whether the cooperator still has work to do when called again? It seems like a similar feat could be achieved with custom generators and boolean yield statements.

Daniel Bickett
On Tue, Aug 24, 2004 at 07:45:13AM -0400, Daniel Bickett wrote: | Tell me if I'm missing something here, but with the exception of your | two introductory keywords (a subject I thought was supposed to be | sacred, with the intention of keeping the amount of keywords as small | as possible,) how much difference is there to generators? Are you | saying the run() method, after having been called initially, will | return a value indicative of the cooperator's business when called | again? In that such a 'cooperator' would keep its state on the heap (in an iterator object to be more precice), it is very much like a generator. In fact, one could implement it using a SuspendIteration exception, in addition to a StopIteration. SuspendIteration would signal that the caller of the iterator _may_ call next() again. In other words, it says: "I didn't produce a value this time, but call me again, and I may have a value for you". Specializations of SuspendIteration could signal that the process is blocked on a socket read, or some other low-level process. So that the caller (which would be a generic run-loop in most cases) would only attempt next() once the block is resolved. | It seems like a similar feat could be achieved with custom generators | and boolean yield statements. I currently do this in twisted.flow, however, it requires that intermediate generators "check" for the special value. This intermediate work is slow (since it is not in C) and tedious since it is mostly boilerplate catch-check-yield kinda stuff On Tue, Aug 24, 2004 at 02:58:42PM +0100, Moore, Paul wrote: | From: Greg Ewing | > There will be a new statement: | > | > suspend | > | [...] 
| >
| >     do another_coop()
| >
| > would be equivalent to
| >
| >     co = another_coop()
| >     while co.run():
| >         suspend
|
| I'm not sure I see the difference between suspend/do and
| yield True/for _ in co: pass, other than possibly that
| co-operators and generators are intended to be mixed (which
| strikes me as implausible).

You'd definitely want to mix them. For example, a 'cooperator' reading
from a socket could do one of three things:

  - yield the next line from the sender
  - raise StopIteration if the socket closed
  - raise SuspendIteration if the read() is blocked

The 'suspend' statement is analogous to raising SuspendIteration.

| I'm likely to be missing something here, but I don't follow
| the semantics you are suggesting.
|
| If there was a simple, realistic use case for this, it might
| help clarify the semantics. (For extra credit, mix co-operators
| and generators in your use case so that the semantics of
| interaction are shown as well :-))

There ya go. ;)

Clark
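The resumable-exception protocol sketched in this exchange can be
written out as a few lines of runnable code. Everything below is
hypothetical illustration in modern Python 3 syntax -- SuspendIteration,
LineReader, and run() are made-up names, not part of any library -- but
it shows the key property: because the hand-written iterator keeps its
state on the heap, next() can simply be called again after a suspension.

```python
class SuspendIteration(Exception):
    """Hypothetical signal: no value was produced this time, but the
    caller may invoke next() again once the block is resolved."""

class LineReader:
    """Sketch of a 'cooperator' reading lines from a socket; a None
    chunk simulates a read() that would block."""
    def __init__(self, chunks):
        self._chunks = iter(chunks)

    def __iter__(self):
        return self

    def __next__(self):
        chunk = next(self._chunks)    # raises StopIteration when closed
        if chunk is None:
            raise SuspendIteration()  # read() is blocked; retry later
        return chunk

def run(it):
    """Generic run-loop: a real reactor would wait for socket
    readiness here instead of retrying immediately."""
    it, lines = iter(it), []
    while True:
        try:
            lines.append(next(it))
        except SuspendIteration:
            continue                  # resumable: the iterator survives
        except StopIteration:
            return lines

run(LineReader(["one", None, "two"]))  # -> ['one', 'two']
```

The suspension is invisible to the run-loop's output; only the
scheduling (here, a bare retry) observes it.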
"Clark C. Evans"
In fact, one could implement it using a SuspendIteration exception, in addition to a StopIteration.
I don't think this can be done quite right by raising an exception in the normal way, because you don't want 'finally' clauses to be triggered.
You'd definitely want to mix them. For example, a 'cooperator' reading
from a socket could do one of three things:
  - yield the next line from the sender
  - raise StopIteration if the socket closed
  - raise SuspendIteration if the read() is blocked
Hmmm. It seems that generation and cooperation are really orthogonal
concepts -- you may want something that is both a generator *and* a
cooperator. My proposal doesn't allow for this. I will have to do some
more thinking...

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz         +--------------------------------------+
how much difference is there to generators?
Very little! Probably too little, in fact, to justify having a
different kind of entity at all. I only suggested it because there
seems to be a reluctance to extend generators in any way that might
suggest using them for purposes other than generation.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz         +--------------------------------------+
A few definitions:

iterator:
> An iterator is an object returned by __iter__() with a next() method
> which produces values. An iterator's state is kept on the heap (not
> on the stack), thus one could consider it 'resumable'. An iterator
> ends when it raises an exception; the very special exception
> 'StopIteration' is used to signal that the iterator is done producing
> values, but is not really an error condition.

generator:
> A syntax for making an iterator; it appears in Python source code as
> a function or method with a 'yield' statement. An implementation is
> free to optimize this function, but it acts as if it had created an
> iterator.

iterlink:
> An iterlink is an association of one iterator, the consumer, to
> another iterator, the producer. The iterlink contains information
> about both iterators, and a marker to indicate the state of the
> consumer when it asked for the next value of the producer.

iterstack:
> A stack of iterlinks, where the producer of one iterlink is the
> consumer of another. The top iterator in the iterstack is an
> iterator which is behaving like a producer, but not a consumer. The
> bottom iterator in the iterstack is one which is a producer to a
> non-iterator. That is, the caller of the bottom iterator is a
> function or method which is not an iterator.

Ok. With this in mind, I like your idea of a new keyword, 'suspend'.
This keyword could be used by the top iterator in an iterstack. When
'suspend' happens, the entire iterstack, and its runstate, is set
aside, and a signal is sent to the caller of the bottom iterator; the
signal could be viewed as a 'resumable exception'. For example, it
could be pictured as a 'SuspendIteration' exception. More concretely,
using a very simple case:
    from time import sleep

    class SuspendIteration(Exception):
        """ signals: no value produced this time, but next() may
            be called again once the block is resolved """

    class TopIterator:
        """ equivalent, under the proposal, to:

            def TopIterator():
                yield "one"
                suspend
                yield "two"
        """
        def _one(self):
            self.next = self._suspend
            return "one"
        def _suspend(self):
            self.next = self._two
            raise SuspendIteration()
        def _two(self):
            self.next = self._stop
            return "two"
        def _stop(self):
            raise StopIteration
        def __iter__(self):
            self.next = self._one
            return self

    class BottomIterator:
        """ equivalent, under the proposal, to:

            def BottomIterator():
                producer = iter(TopIterator())
                saved_one = producer.next()
                saved_two = producer.next()
                yield (saved_one, saved_two)
        """
        def _one(self):
            self.saved_one = self.producer.next()
            self.next = self._two
            return self.next()
        def _two(self):
            self.saved_two = self.producer.next()
            self.next = self._stop
            return (self.saved_one, self.saved_two)
        def _stop(self):
            raise StopIteration
        def __iter__(self):
            self.producer = iter(TopIterator())
            self.next = self._one
            return self

    def caller():
        it = iter(BottomIterator())
        while True:
            try:
                print it.next()
            except StopIteration:
                break
            except SuspendIteration:
                sleep(10)   # or hand control back to a reactor
                continue

Anyway, the behavior is probably equivalent to 'resumable exceptions';
however, I don't think it is a _different_ item from an iterator, just
a new set of behaviors triggered by the 'suspend' keyword. As long as
the 'iterator construction' is automated, the caller() function can be
delegated to Twisted's reactor, or asyncore, or any other cooperative
multitasking base. The superclasses for SuspendIteration could be
'BlockForSocket', for example.

On Tue, Aug 24, 2004 at 05:57:18PM +1200, Greg Ewing wrote:
| This makes me think that maybe we want another kind of object, similar
| to a generator, but designed to be used for side effects rather than
| to produce values.
(a) the generation of 'top-level' iterators using 'suspend' would be
    quite easy, see above

(b) the generation logic for "pass-through" generators (aka
    BottomIterator in the example above) would have to happen for the
    mechanism to be truly useful; SuspendIteration would have to be a
    pass-through exception

(c) updating itertools to allow SuspendIteration to 'pass through'

| For want of a better term, let's call it a "cooperator" (so it ends in
| "ator" like "generator" :-). Actually there will be two objects, a
| cooperator-function (analogous to a generator-function) and a
| cooperator-instance (analogous to a generator-iterator).
|
| Calling a cooperator-function will return a cooperator-instance, which
| will obey the cooperator protocol. This protocol consists of a single
| function run(), which causes the cooperator-instance to perform some
| amount of work and then stop. If there is more work remaining to be
| done, run() returns a true value, otherwise it returns a false value.

Well, run() need not be specified. It would be implemented by the
lower-level 'reactor' code, aka the "caller" in the example.

| There will be a new statement:
|
|     suspend
|
| for use inside a cooperator-function. The presence of this statement
| is what distinguishes a cooperator-function from an ordinary function
| (just as the presence of 'yield' distinguishes a generator from an
| ordinary function). Execution of 'suspend' will cause the run() method
| to return with a true value. The next time run() is called, execution
| will resume after the 'suspend'.

Right. Only that I'd allow both 'yield' and 'suspend' in the same
function, or it really isn't that useful.

| This is really all that's needed, but one further piece of syntax may
| be desirable to make it easier for one cooperator to invoke another:
|
|     do another_coop()
|
| would be equivalent to
|
|     co = another_coop()
|     while co.run():
|         suspend

Exactly; this is where the tough part is, the 'intermediate'
cooperators.
Ideally, all generators (since they keep their state on the heap) would
make good 'intermediate' cooperators.

| Something to note is that there's no requirement for a cooperator-
| instance to be implemented by a cooperator-function, just as an
| iterator can be implemented in ways other than a generator. This
| sidesteps some of the difficulties of mixing Python and C calls, since
| a Python cooperator-function could invoke a cooperator implemented in
| C which in turn invokes another Python cooperator-function, etc. The
| only requirement is that all the objects on the way down obey the
| cooperator protocol.

Yes. This is the key: you _only_ attempt to suspend iterators, that is,
objects that already keep their state in the heap.

| The drawback is that you always have to know when you're calling
| another cooperator and use 'do'. But that's no worse than the current
| situation with generators, where you always have to know you're using
| an iterator and use 'for ... yield'.

Er, well, if the interface was a 'SuspendIteration' then it could be
generalized to all iterators. After all, while an exception invalidates
the stack-frame for the current call to next(), it doesn't prevent the
caller from invoking next() once again. If someone calls next()
manually, then SuspendIteration would act as a regular exception... and
be uncaught.

Clark
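To see why the 'intermediate' cooperators are the tough part, here is a
sketch (again with a hypothetical SuspendIteration; Producer, middle,
and drive are made-up names) contrasting a hand-written iterator with a
plain generator in the middle. When an exception escapes a generator's
frame, Python finalizes the generator, so it cannot be resumed -- which
is exactly why the exception would need pass-through support to be
useful.

```python
class SuspendIteration(Exception):
    """Hypothetical 'no value yet, call next() again later' signal."""

class Producer:
    """Hand-written iterator: its state lives on the heap, so it
    survives a SuspendIteration escaping from __next__."""
    def __init__(self):
        self._steps = iter(["one", SuspendIteration, "two"])

    def __iter__(self):
        return self

    def __next__(self):
        step = next(self._steps)      # StopIteration when exhausted
        if step is SuspendIteration:
            raise SuspendIteration()
        return step

def middle():
    # A plain generator as the intermediate consumer. When
    # SuspendIteration escapes this frame, Python finalizes the
    # generator, so it cannot produce any further values.
    for value in Producer():
        yield value

def drive(it):
    out, it = [], iter(it)
    while True:
        try:
            out.append(next(it))
        except SuspendIteration:
            out.append("<suspended>")
            continue                  # retrying only helps Producer
        except StopIteration:
            return out

drive(Producer())  # -> ['one', '<suspended>', 'two']   resumable
drive(middle())    # -> ['one', '<suspended>']          generator died
```

The bare iterator delivers "two" after the suspension; the generator
version silently ends, which is the leak of values the pass-through
machinery would have to prevent.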
"Clark C. Evans"
I just read the thread 'Stackless Python' in June 2004 on python-dev and was wondering if you'd comment
Sure
on a simpler cooperative mechanism, via a small hack to generators:
If it is possible, as proposed, maybe at least a medium hack?
1. The PEP would introduce a new 'builtin' class called 'Cooperate'
2. Generator semantics would be altered so that 'yield X', where X is an instance of Cooperate, would automagically propagate to the outer-most non-generator.
-1

Skipping over the generic objections of 'hard to teach to beginners'
and 'opens the door to a hundred other such specialized features':

1. This alters and complexifies the current generator concept. At
present, 'generator' (the return type of any generator function) is an
instance of the iterator concept, as is 'iterator' (the return type of
iter()) and any class with appropriate .__iter__ and .next methods. A
generator .next method is a resumable function; yield is like return
except not permanent. The local namespace is not destroyed and the
instruction pointer is not reset for the next call.
From the outside, a generator .next method is, I believe,
indistinguishable in Python (excluding possible implementation-specific
voodoo) from an iterator-type .next method, and behaviorally
indistinguishable from any other iterator-protocol parameterless .next
method. As far as I know, the difference between a generator and other
iterators is ease of writing (sometimes *much* easier) and speed of
execution, not external behavior.
2. This imposes an isinstance(x, magic.Collaborate) function call cost
on every yield in every generator. (I do not believe the alternative of
imposing the same cost on every function call in every generator will
work, since I do not believe that callers can dependably know whether
objects are returned or yielded. See comments on the example.)

3. 'Automagic' it is. The pertinent runtime call stack contains not
generator objects but .next methods (or in CPython, method-wrappers).
So I presume you are proposing that yielded Collaborates magically
plunk themselves into the first frame up the stack not associated with
such objects.

4. If the returned-to frame does not re-call the passed-over suspended
frames, they and anything they hold onto are leaked until gc'ed.
>>> def MyCooperate(magic.Cooperate): >>> __str__(self): return "cooperate" >>> >>> def top(): >>> yield "one" >>> yield MyCooperate() >>> yield "two" >>> >>> def middle(): >>> """ intermediate generator _only_ sees one and two """ >>> for x in top(): >>> print "middle", x >>> yield x
Here, the effect you seek could be accomplished by an explicit
'if not isinstance(x, Collaborate):' before the print statement (or, in
general, the block of code that uses x). But what if top is rewritten
as:

    def top():
        return ("one", MyCooperate(), "two")

Do you really want the behavior of middle to change, as with your
proposal?
>>> def lower(): >>> """ this is not a generator, so it sees cooperate """ >>> for x in middle(): >>> print "lower", x
Now add 'if isinstance(x, Cooperate): return' and the suspended middle will be left in limbo. Not very tidy ;-) Terry J. Reedy
Thanks for responding, Terry. I'm not sure if my exposition was clear.

On Mon, Aug 23, 2004 at 08:33:09PM -0400, Terry Reedy wrote:
| > 1. The PEP would introduce a new 'builtin' class called 'Cooperate'
| >
| > 2. Generator semantics would be altered so that 'yield X', where X
| >    is an instance of Cooperate, would automagically propagate to
| >    the outer-most non-generator.

Suppose I have three 'frames', two generators and one function; the
call stack looks like:

    g2
    g1
    fn

For this to occur, fn has called .next() on g1, which has called
.next() on g2. The proposed behavior is that "yield magic.Cooperate"
provides the value returned by g1.next() within fn. In particular, it
would be equivalent as if every "val = g2.next()" in g1 was rewritten:

    val = g2.next()
    if isinstance(val, magic.Cooperate):
        yield val

One could do this in three ways: (a) by a convention above, as is done
in twisted.flow, (b) by a bytecode hack, or (c) by modifying eval.c to
have equivalent behavior.

| 1. This alters and complexifies the current generator concept.

Correct; it introduces something similar to a coroutine.

| 2. This imposes an isinstance(x, magic.Collaborate) function call cost on
| every yield in every generator.

Ok. So, in addition to causing unexpected results (one would have to
check to see what sort of object to debug), it would also cause a
performance hit.

| 3. 'automagic' it is. The pertinent runtime call stack contains not
| generator objects but .next methods (or in CPython, method-wrappers).
| So I presume you are proposing that yielded Collaborates magically
| plunk themselves into the first frame up the stack not associated
| with such objects.

Right, the first frame that is not a generator, aka not having a yield
statement. And yes, it is different semantics.

| 4. If the returned-to frame does not re-call the passed-over suspended
| frames, they and anything they hold onto are leaked until gc'ed.

No. See the above equivalence.
It might take a bit of thinking to make the "C" implementation correct,
but the semantics are fairly straightforward and should not create
leaks.

| Here, the effect you seek could be accomplished by an explicit 'if not
| isinstance(x, Collaborate):' before the print statement (or, in
| general, the block of code that uses x). But what if top is rewritten
| as:
|     def top():
|         return ("one", MyCooperate(), "two")
| Do you really want the behavior of middle to change, as with your
| proposal?

Yes.

| > >>> def lower():
| > >>>     """ this is not a generator, so it sees cooperate """
| > >>>     for x in middle():
| > >>>         print "lower", x
|
| Now add 'if isinstance(x, Cooperate): return' and the suspended middle
| will be left in limbo. Not very tidy ;-)

No, it would be just as if a generator had 'if 1: return'; the
suspended middle would be cleaned up just as it is today.

... However, you've convinced me on the semantic differences and the
performance hit. I guess the proposal would have to present a new
keyword, say 'collaborate', and have its semantics clearly specified.

Thank you _so_ much for your kind review,

Clark
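The catch-check-yield convention described in this exchange can be
written out as running code. In the sketch below, Cooperate is a
hypothetical marker class (standing in for magic.Cooperate), and g2,
g1, and fn mirror the three frames in the example, with g1 carrying the
boilerplate the proposal would push into the interpreter.

```python
class Cooperate:
    """Hypothetical marker the proposal would propagate automatically."""

def g2():
    yield "one"
    yield Cooperate()     # request to yield control to the reactor
    yield "two"

def g1():
    # The manual convention: every intermediate generator must check
    # for the marker and re-yield it instead of treating it as data.
    for val in g2():
        if isinstance(val, Cooperate):
            yield val     # pass through untouched
            continue
        yield val.upper() # ordinary processing of real values

def fn():
    # The outer-most non-generator: the frame the marker is aimed at.
    seen = []
    for val in g1():
        if isinstance(val, Cooperate):
            seen.append("cooperate")  # a reactor would reschedule here
        else:
            seen.append(val)
    return seen

fn()  # -> ['ONE', 'cooperate', 'TWO']
```

Under the proposal, the two isinstance lines in g1 would disappear; the
marker would skip over g1's frame on its own.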
it would be equivalent as if every "val = g2.next()" in g1 was rewritten:
    val = g2.next()
    if isinstance(val, magic.Cooperate):
        yield val
Something bothers me about that proposal, and I think it's this: it
would mean that generators are no longer simply a particular technique
for implementing iterators, but would have special powers that other
iterators don't have. Unless recognising the magic Cooperate value and
doing the right thing were made part of the iterator protocol, and all
iterators were expected to do the right thing with it, which would be a
big imposition.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz         +--------------------------------------+
Greg Ewing wrote:
it would be equivalent as if every "val = g2.next()" in g1 was rewritten:
    val = g2.next()
    if isinstance(val, magic.Cooperate):
        yield val
Something bothers me about that proposal, and I think it's this: it
would mean that generators are no longer simply a particular technique
for implementing iterators, but would have special powers that other
iterators don't have. Unless recognising the magic Cooperate value and
doing the right thing were made part of the iterator protocol, and all
iterators were expected to do the right thing with it, which would be a
big imposition.
I agree with Greg and the other comments on wanting to keep generators
simple. Part of the beauty of generators is they are brain-dead simple
to use while still being handy as hell. Playing with them to add the
ability to do asynchronous stuff doesn't seem right. I would rather see
full-blown coroutines for Python 3000 to deal with this somehow.

-Brett
Clark C. Evans wrote: ...
With these two changes, the "lower" function could be an async reactor
like those found in Twisted, or asyncore. While the result isn't true
coroutines, it would be a huge improvement for those who would like to
do async coding. I've done something similar with Twisted called Flow
[1] and it works well, with the exception of being a painful syntax
hack and being quite slow. If this was moved into Python's core, we'd
get most of the advantages of coroutines without the full cost.
Thoughts?
Well, I just think "no".

Generators, as limited as they are in Python, make some sense.
Coroutines, for me, have the advantage of making a context switch.
While generators are very often called in a context where they could
even be inlined, coroutines should be really independent. But they are
not independent if you just cannot switch, because one of them just
happens to call a different function.

A typical use of coroutines is the situation where it cannot be deduced
who is caller or callee. What I mean is: those situations which can be
solved with a stack are the trivial cases, and that is exactly *not*
what Stackless is about.

ciao - chris

--
Christian Tismer             :^)   mailto:tismer@stackless.com
tismerysoft GmbH             :     Have a break! Take a ride on Python's
Carmerstr. 2                 :    *Starship* http://starship.python.net/
10623 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
work +49 30 31 86 04 18  home +49 30 802 86 56  mobile +49 173 24 18 776
PGP 0x57F3BF04       9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
     whom do you want to sponsor today?   http://www.stackless.com/
participants (7)
-
Brett C.
-
Christian Tismer
-
Clark C. Evans
-
Daniel Bickett
-
Greg Ewing
-
Phillip J. Eby
-
Terry Reedy