PEP 334 - Simple Coroutines via SuspendIteration
I've packaged up the idea of a coroutine facility using iterators and an exception, SuspendIteration. This would require some rather deep changes to how generators are implemented; however, it seems backwards compatible, implementable with the JVM or CLR, and would make most of my database/web development work far more pleasant.

  http://www.python.org/peps/pep-0334.html

Cheers!

Clark

...

PEP: 334
Title: Simple Coroutines via SuspendIteration
Version: $Revision: 1.1 $
Last-Modified: $Date: 2004/09/08 00:11:18 $
Author: Clark C. Evans <info@clarkevans.com>
Status: Draft
Type: Standards Track
Python-Version: 3.0
Content-Type: text/x-rst
Created: 26-Aug-2004
Post-History:

Abstract
========

Asynchronous application frameworks such as Twisted [1]_ and Peak [2]_ are based on cooperative multitasking via event queues or deferred execution. While this approach to application development does not involve threads, and thus avoids a whole class of problems [3]_, it creates a different sort of programming challenge. When an I/O operation would block, a user request must suspend so that other requests can proceed. The concept of a coroutine [4]_ promises to help the application developer grapple with this state-management difficulty.

This PEP proposes a limited approach to coroutines based on an extension to the iterator protocol [5]_. Currently, an iterator may raise a StopIteration exception to indicate that it is done producing values. This proposal adds another exception to the protocol, SuspendIteration, which indicates that the given iterator may have more values to produce, but is unable to do so at this time.

Rationale
=========

There are two current approaches to bringing coroutines to Python. Christian Tismer's Stackless [6]_ involves a ground-up restructuring of Python's execution model by hacking the 'C' stack. While this approach works, its operation is hard to describe and keep portable.
A related approach is to compile Python code to Parrot [7]_, a register-based virtual machine which has coroutines. Unfortunately, neither of these solutions is portable to IronPython (CLR) or Jython (JavaVM).

It is thought that a more limited approach, based on iterators, could provide a coroutine facility to application programmers and still be portable across runtimes.

* Iterators keep their state in local variables that are not on the "C" stack. Iterators can be viewed as classes, with state stored in member variables that are persistent across calls to its next() method.

* While an uncaught exception may terminate a function's execution, an uncaught exception need not invalidate an iterator. The proposed exception, SuspendIteration, uses this feature. In other words, just because one call to next() results in an exception does not necessarily imply that the iterator itself is no longer capable of producing values.

There are four places where this new exception has an impact:

* The simple generator [8]_ mechanism could be extended to safely 'catch' this SuspendIteration exception, stuff away its current state, and pass the exception on to the caller.

* Various iterator filters [9]_ in the standard library, such as itertools.izip, should be made aware of this exception so that they can transparently propagate SuspendIteration.

* Iterators generated from I/O operations, such as a file or socket reader, could be modified to have a non-blocking variety. This option would raise a subclass of SuspendIteration if the requested operation would block.

* The asyncore library could be updated to provide a basic 'runner' that pulls from an iterator; if the SuspendIteration exception is caught, then it moves on to the next iterator in its runlist [10]_. External frameworks like Twisted would provide alternative implementations, perhaps based on FreeBSD's kqueue or Linux's epoll.
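The 'runner' described in the last bullet can be illustrated with a few lines of code. The following is a non-normative sketch in modern Python: the SuspendIteration class and the Flaky iterator are local stand-ins invented for this example, not part of any library or of the proposal's text.

```python
# Illustrative sketch only: SuspendIteration and Flaky are stand-ins
# defined here, not standard-library names.

class SuspendIteration(Exception):
    """The iterator has more values, but cannot produce one right now."""

class Flaky:
    """Suspends once before producing each of its items."""
    def __init__(self, items):
        self._items = list(items)
        self._ready = False

    def __iter__(self):
        return self

    def __next__(self):
        if not self._items:
            raise StopIteration
        if not self._ready:
            self._ready = True
            raise SuspendIteration()   # "would block": try again later
        self._ready = False
        return self._items.pop(0)

def run_all(iterators):
    """Round-robin runner: a suspended iterator stays on the runlist,
    an exhausted one is dropped.  (A real runner would block in
    select()/poll() instead of spinning.)"""
    results = [[] for _ in iterators]
    runlist = list(enumerate(iter(it) for it in iterators))
    while runlist:
        keep = []
        for idx, it in runlist:
            try:
                results[idx].append(next(it))
                keep.append((idx, it))
            except SuspendIteration:
                keep.append((idx, it))   # not done, just not ready
            except StopIteration:
                pass                     # finished: drop from runlist
        runlist = keep
    return results

print(run_all([Flaky([1, 2]), Flaky(['a'])]))   # [[1, 2], ['a']]
```

Note that this runner simply retries a suspended iterator on the next pass; the exception terminates only the current next() call, not the iterator itself.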
While these may seem like dramatic changes, it is a very small amount of work compared with the utility provided by continuations.

Semantics
=========

This section will explain, at a high level, how the introduction of this new SuspendIteration exception would behave.

Simple Iterators
----------------

The current functionality of iterators is best seen with a simple example which produces two values, 'one' and 'two'. ::

    class States:

        def __iter__(self):
            self._next = self.state_one
            return self

        def next(self):
            return self._next()

        def state_one(self):
            self._next = self.state_two
            return "one"

        def state_two(self):
            self._next = self.state_stop
            return "two"

        def state_stop(self):
            raise StopIteration

    print list(States())

An equivalent iteration could, of course, be created by the following generator::

    def States():
        yield 'one'
        yield 'two'

    print list(States())

Introducing SuspendIteration
----------------------------

Suppose that between producing 'one' and 'two', the generator above could block on a socket read. In this case, we would want to raise SuspendIteration to signal that the iterator is not done producing, but is unable to provide a value at the current moment. ::

    from random import randint
    from time import sleep

    class SuspendIteration(Exception):
        pass

    class NonBlockingResource:

        """Randomly unable to produce the second value"""

        def __iter__(self):
            self._next = self.state_one
            return self

        def next(self):
            return self._next()

        def state_one(self):
            self._next = self.state_suspend
            return "one"

        def state_suspend(self):
            rand = randint(1, 10)
            if 2 == rand:
                self._next = self.state_two
                return self.state_two()
            raise SuspendIteration()

        def state_two(self):
            self._next = self.state_stop
            return "two"

        def state_stop(self):
            raise StopIteration

    def sleeplist(iterator, timeout=.1):
        """ Do other things (e.g. sleep) while the resource
            is unable to provide the next value
        """
        it = iter(iterator)
        retval = []
        while True:
            try:
                retval.append(it.next())
            except SuspendIteration:
                sleep(timeout)
                continue
            except StopIteration:
                break
        return retval

    print sleeplist(NonBlockingResource())

In a real-world situation, the NonBlockingResource would be a file iterator, socket handle, or other I/O-based producer. The sleeplist would instead be an async reactor, such as those found in asyncore or Twisted. The non-blocking resource could, of course, be written as a generator::

    def NonBlockingResource():
        yield "one"
        while True:
            rand = randint(1, 10)
            if 2 == rand:
                break
            raise SuspendIteration()
        yield "two"

It is not necessary to add a keyword, 'suspend', since most real content generators will not be in application code; they will be in low-level, I/O-based operations. Since most programmers need not be exposed to the SuspendIteration() mechanism, a keyword is not needed.

Application Iterators
---------------------

The previous example is rather contrived; a more 'real-world' example would be a web page generator which yields HTML content, and pulls from a database. Note that this is an example of neither the 'producer' nor the 'consumer', but rather of a filter. ::

    def ListAlbums(cursor):
        cursor.execute("SELECT title, artist FROM album")
        yield '<html><body><table><tr><td>Title</td><td>Artist</td></tr>'
        for (title, artist) in cursor:
            yield '<tr><td>%s</td><td>%s</td></tr>' % (title, artist)
        yield '</table></body></html>'

The problem, of course, is that the database may block for some time before any rows are returned, and that during execution, rows may be returned in blocks of 10 or 100 at a time. Ideally, if the database blocks for the next set of rows, another user connection could be serviced. Note the complete absence of SuspendIteration in the above code. If done correctly, application developers would be able to focus on functionality rather than concurrency issues.
The iterator created by the above generator should do the magic necessary to maintain state, yet pass the exception through to a lower-level async framework. Here is an example of what the corresponding iterator would look like if coded up as a class::

    class ListAlbums:

        def __init__(self, cursor):
            self.cursor = cursor

        def __iter__(self):
            self.cursor.execute("SELECT title, artist FROM album")
            self._iter = iter(self.cursor)
            self._next = self.state_head
            return self

        def next(self):
            return self._next()

        def state_head(self):
            self._next = self.state_cursor
            return "<html><body><table><tr><td>" \
                   "Title</td><td>Artist</td></tr>"

        def state_tail(self):
            self._next = self.state_stop
            return "</table></body></html>"

        def state_cursor(self):
            try:
                (title, artist) = self._iter.next()
                return '<tr><td>%s</td><td>%s</td></tr>' % (title, artist)
            except StopIteration:
                self._next = self.state_tail
                return self.next()
            except SuspendIteration:
                # just pass-through
                raise

        def state_stop(self):
            raise StopIteration

Complicating Factors
--------------------

While the above example is straightforward, things are a bit more complicated if the intermediate generator 'condenses' values, that is, it pulls in two or more values for each value it produces. For example::

    def pair(iterLeft, iterRight):
        rhs = iter(iterRight)
        lhs = iter(iterLeft)
        while True:
            yield (rhs.next(), lhs.next())

In this case, the corresponding iterator behavior has to be a bit more subtle to handle the case of either the right or left iterator raising SuspendIteration. It seems to be a matter of decomposing the generator to recognize intermediate states where a SuspendIteration exception from the producing context could happen.
::

    class pair:

        def __init__(self, iterLeft, iterRight):
            self.iterLeft = iterLeft
            self.iterRight = iterRight

        def __iter__(self):
            self.rhs = iter(self.iterRight)
            self.lhs = iter(self.iterLeft)
            self._temp_rhs = None
            self._temp_lhs = None
            self._next = self.state_rhs
            return self

        def next(self):
            return self._next()

        def state_rhs(self):
            self._temp_rhs = self.rhs.next()
            self._next = self.state_lhs
            return self.next()

        def state_lhs(self):
            self._temp_lhs = self.lhs.next()
            self._next = self.state_pair
            return self.next()

        def state_pair(self):
            self._next = self.state_rhs
            return (self._temp_rhs, self._temp_lhs)

This proposal assumes that a corresponding iterator written using this class-based method is possible for existing generators. The challenge seems to be the identification of distinct states within the generator where suspension could occur.

Resource Cleanup
----------------

The current generator mechanism has a strange interaction with exceptions, in that a 'yield' statement is not allowed within a try/finally block. The SuspendIteration exception raises another, similar issue. The impacts of this issue are not clear. However, it may be that rewriting the generator into a state machine, as the previous section did, could resolve this issue, leaving the situation no worse than, and perhaps even removing, the yield/finally restriction. More investigation is needed in this area.

API and Limitations
-------------------

This proposal only covers 'suspending' a chain of iterators, and does not cover (of course) suspending general functions, methods, or "C" extension functions. While there could be no direct support for creating generators in "C" code, native "C" iterators which comply with the SuspendIteration semantics are certainly possible.

Low-Level Implementation
========================

The author of the PEP is not yet familiar enough with the Python execution model to comment in this area.

References
==========

.. [1] Twisted
   (http://twistedmatrix.com)
.. [2] Peak
   (http://peak.telecommunity.com)

.. [3] C10K
   (http://www.kegel.com/c10k.html)

.. [4] Coroutines
   (http://c2.com/cgi/wiki?CallWithCurrentContinuation)

.. [5] PEP 234, Iterators
   (http://www.python.org/peps/pep-0234.html)

.. [6] Stackless Python
   (http://stackless.com)

.. [7] Parrot with coroutines
   (http://www.sidhe.org/~dan/blog/archives/000178.html)

.. [8] PEP 255, Simple Generators
   (http://www.python.org/peps/pep-0255.html)

.. [9] itertools - Functions creating iterators
   (http://docs.python.org/lib/module-itertools.html)

.. [10] Microthreads in Python, David Mertz
   (http://www-106.ibm.com/developerworks/linux/library/l-pythrd.html)

Copyright
=========

This document has been placed in the public domain.

--
Clark C. Evans                          Prometheus Research, LLC.
http://www.prometheusresearch.com/
office: +1.203.777.2550
mobile: +1.203.444.0557
Prometheus Research: Transforming Data Into Knowledge
  - Research Exchange Database
  - Survey & Assessment Technologies
  - Software Tools for Researchers
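As a quick check of the PEP's 'condensing filter' discussion, here is a hedged sketch (modern Python, all names invented locally) showing that the decomposed pair state machine keeps the value it has already pulled across a suspension of the other side:

```python
# Sketch only: SuspendIteration, Suspender, Pair and drain are all
# defined locally for illustration; none of this is existing API.

class SuspendIteration(Exception):
    pass

class Suspender:
    """Raises SuspendIteration once before yielding each item."""
    def __init__(self, items):
        self._items = iter(items)
        self._armed = True

    def __iter__(self):
        return self

    def __next__(self):
        if self._armed:
            self._armed = False
            raise SuspendIteration()
        self._armed = True
        return next(self._items)

class Pair:
    """State-machine version of the PEP's 'pair' filter: the value
    already pulled is kept in an attribute across a suspension."""
    def __init__(self, left, right):
        self._lhs_iter = iter(left)
        self._rhs_iter = iter(right)
        self._rhs = None
        self._next = self._state_rhs

    def __iter__(self):
        return self

    def __next__(self):
        return self._next()

    def _state_rhs(self):
        self._rhs = next(self._rhs_iter)   # may raise SuspendIteration
        self._next = self._state_lhs
        return self._next()

    def _state_lhs(self):
        lhs = next(self._lhs_iter)         # may raise SuspendIteration
        self._next = self._state_rhs
        return (self._rhs, lhs)

def drain(it):
    """Naive driver: on suspension, simply retry the same state."""
    out = []
    while True:
        try:
            out.append(next(it))
        except SuspendIteration:
            continue
        except StopIteration:
            return out

print(drain(Pair(['L1', 'L2'], Suspender(['R1', 'R2']))))
# [('R1', 'L1'), ('R2', 'L2')]
```

Because suspension happens between the _state_rhs and _state_lhs states, a retry resumes at the pending state rather than restarting the pair from scratch.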
Josiah Carlson kindly pointed out (off list) that my use of SuspendIteration violates the standard idiom of exceptions terminating the current function. This got past me because I think of a generator not as a function, but rather as a shortcut to creating iterators. The offending code is:

 | def NonBlockingResource():
 |     yield "one"
 |     while True:
 |         rand = randint(1,10)
 |         if 2 == rand:
 |             break
 |         raise SuspendIteration()
 |     yield "two"

There are two solutions: (a) introduce a new keyword, 'suspend'; or (b) don't do that.

It is not essential to the proposal that the generator syntax produce iterators that can SuspendIteration; it is only essential that the implementation of generators pass this exception through. Most non-blocking resources will be low-level components from an async database or socket library; they can make iterators the old way.

Cheers,

Clark
Clark C. Evans wrote:
Josiah Carlson kindly pointed out (off list) that my use of SuspendIteration violates the standard idiom of exceptions terminating the current function. This got past me because I think of a generator not as a function, but rather as a shortcut to creating iterators. The offending code is:
 | def NonBlockingResource():
 |     yield "one"
 |     while True:
 |         rand = randint(1,10)
 |         if 2 == rand:
 |             break
 |         raise SuspendIteration()
 |     yield "two"
There are two solutions: (a) introduce a new keyword, 'suspend'; or (b) don't do that.
It is not essential to the proposal that the generator syntax produce iterators that can SuspendIteration; it is only essential that the implementation of generators pass this exception through. Most non-blocking resources will be low-level components from an async database or socket library; they can make iterators the old way.
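A low-level resource 'made the old way' (a hand-written class iterator) might look like the following sketch. The feed()/close() interface is invented here purely to stand in for a socket in non-blocking mode; SuspendIteration is likewise defined locally:

```python
# Sketch only: in real code the reader would wrap a non-blocking socket
# and raise SuspendIteration on EWOULDBLOCK; here arriving data is
# simulated with a feedable buffer so the example is self-contained.

class SuspendIteration(Exception):
    pass

class NonBlockingReader:
    def __init__(self):
        self._chunks = []     # data that has "arrived" so far
        self._closed = False

    def feed(self, chunk):
        self._chunks.append(chunk)

    def close(self):
        self._closed = True

    def __iter__(self):
        return self

    def __next__(self):
        if self._chunks:
            return self._chunks.pop(0)
        if self._closed:
            raise StopIteration
        raise SuspendIteration()   # would block: no data yet, not closed

reader = NonBlockingReader()
reader.feed(b'one')
got = [next(reader)]               # b'one'
try:
    next(reader)                   # buffer empty: suspends, not dead
except SuspendIteration:
    pass
reader.feed(b'two')
got.append(next(reader))           # the iterator is still usable
reader.close()
print(got)                         # [b'one', b'two']
```

The point being illustrated is the one from the PEP's rationale: the SuspendIteration raised by one next() call does not invalidate the iterator, which keeps producing once data arrives.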
What about this?

    def somefunc():
        raise SuspendIteration()
        return 'foo'

    def genfunc():
        yield somefunc()

Jp
On Wed, Sep 08, 2004 at 09:07:59AM -0400, Jp Calderone wrote:
|   def somefunc():
|       raise SuspendIteration()
|       return 'foo'
|
|   def genfunc():
|       yield somefunc()

Interesting, but:

 - somefunc is a function, thus SuspendIteration() should
   terminate the function, raising an exception
 - somefunc is not a generator, so it cannot be yielded.

However, perhaps something like...

    def suspend(*args, **kwargs):
        raise SuspendIteration(*args, **kwargs)
        # never ever returns

    def myProducer():
        yield "one"
        suspend()
        yield "two"

Regardless, this is a side point. The authors of iterators that raise a SuspendIteration() will be low-level code, like a next() which reads the next block from a socket or row from a database query. In these cases, the class-style iterator is sufficient.

The real point is that user-level generators, such as this example from the PEP (which is detailed there as a class-based iterator), should transparently handle SuspendIteration() by passing it up the generator chain without killing the current scope.

| def ListAlbums(cursor):
|     cursor.execute("SELECT title, artist FROM album")
|     yield '<html><body><table><tr><td>Title</td><td>Artist</td></tr>'
|     for (title, artist) in cursor:
|         yield '<tr><td>%s</td><td>%s</td></tr>' % (title, artist)
|     yield '</table></body></html>'

For those who say that this iterator should be invalidated when cursor.next() raises SuspendIteration(), I point out that it is not invalidated when cursor.next() raises StopIteration().

Kind Regards,

Clark
On Wed, Sep 08, 2004 at 09:26:03AM -0400, Clark C. Evans wrote:
| On Wed, Sep 08, 2004 at 09:07:59AM -0400, Jp Calderone wrote:
| |   def somefunc():
| |       raise SuspendIteration()
| |       return 'foo'
| |
| |   def genfunc():
| |       yield somefunc()
|
| Interesting, but:
|  - somefunc is a function, thus SuspendIteration() should
|    terminate the function; raising an exception
|  - somefunc is not a generator, so it cannot be yielded.

It's too early for me to be posting; scrap the nonsense in this second point. I don't think this changes the suggestion below though.

| However, perhaps something like...
|
|     def suspend(*args, **kwargs):
|         raise SuspendIteration(*args, **kwargs)
|         # never ever returns
|
|     def myProducer():
|         yield "one"
|         suspend()
|         yield "two"
|
| Regardless, this is a side point. The authors of iterators that
| raise a SuspendIteration() will be low-level code, like a next()
| which reads the next block from a socket or row from a database
| query. In these cases, the class-style iterator is sufficient.
|
| The real point is that user-level generators, such as this example
| from the PEP (which is detailed as a class-based iterator), should
| transparently handle SuspendIteration() by passing it up the generator
| chain without killing the current scope.
|
| | def ListAlbums(cursor):
| |     cursor.execute("SELECT title, artist FROM album")
| |     yield '<html><body><table><tr><td>Title</td><td>Artist</td></tr>'
| |     for (title, artist) in cursor:
| |         yield '<tr><td>%s</td><td>%s</td></tr>' % (title, artist)
| |     yield '</table></body></html>'
|
| For those who say that this iterator should be invalidated when
| cursor.next() raises SuspendIteration(), I point out that it is not
| invalidated when cursor.next() raises StopIteration().
|
| Kind Regards,
|
| Clark
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
On Sep 7, 2004, at 9:48 PM, Clark C. Evans wrote:
I've packaged up the idea of a coroutine facility using iterators and an exception, SuspendIteration.
Very interesting.
This proposal assumes that a corresponding iterator written using this class-based method is possible for existing generators. The challenge seems to be the identification of distinct states within the generator where suspension could occur.
That is basically impossible. Essentially *every* operation could possibly raise SuspendIteration, because essentially every operation can call an arbitrary Python function, and Python functions can raise any exception they want.

I think you could still make the proposal work in CPython: if I understand its internals properly, it doesn't need to do a transformation to a class iterator; it simply suspends the frame wherever it is. Thus, being able to suspend at any point in the function would not cause an undue performance degradation.

However, I think it is a deal-breaker for Jython. From the generator PEP: "It's also believed that efficient implementation in Jython requires that the compiler be able to determine potential suspension points at compile-time, and a new keyword makes that easy." If this quote is right about the implementation of Jython (and it seems likely, given the JVM), your proposal makes it impossible to implement generators in Jython.

Given that the advantage claimed for this proposal over Stackless is that it can be implemented in non-CPython runtimes, I think it still needs some reworking.

James
On Wed, Sep 08, 2004 at 02:08:17PM -0400, James Y Knight wrote:
| >This proposal assumes that a corresponding iterator written using
| >this class-based method is possible for existing generators. The
| >challenge seems to be the identification of distinct states within
| >the generator where suspension could occur.
|
| That is basically impossible. Essentially *every* operation could
| possibly raise SuspendIteration, because essentially every operation
| can call an arbitrary python function, and python functions can raise
| any exception they want.

If the SuspendIteration() was raised in an arbitrary Python function, it would close out the function call due to exception semantics. So, a brain-dead implementation would have to make each place a function is called a separate state. The proposal is not implying that arbitrary functions would be converted into generators if they happened to raise SuspendIteration().

| I think you could still make the proposal work
| in CPython: if I understand its internals properly, it doesn't need to
| do a transformation to a class iterator, it simply suspends the frame
| wherever it is. Thus, being able to suspend at any point in the
| function would not cause an undue performance degradation.

Ok.

| However, I think it is a deal-breaker for JPython. From the generator
| PEP: "It's also believed that efficient implementation in Jython
| requires that the compiler be able to determine potential suspension
| points at compile-time, and a new keyword makes that easy." If this
| quote is right about the implementation of Jython (and it seems likely,
| given the JVM), your proposal makes it impossible to implement
| generators in Jython.

Ok, because suspension points would now include not only 'yield' statements, but potentially any function call. So, it could be quite inefficient, but it is not impossible. For an optimization, you could decorate a function if it could throw a SuspendIteration.
If a non-decorated function threw that exception, it would be a deal-breaker.

| Given that the advantage claimed for this proposal over stackless is
| that it can be implemented in non-CPython runtimes, I think it still
| needs some reworking.

Thanks for your feedback.

Best,

Clark
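The decoration idea Clark mentions might amount to no more than a marker that a compiler (e.g. Jython's) could look for when deciding which call sites are potential suspension points. This is a purely hypothetical sketch; nothing here is real API, and a real implementation would act at compile time rather than by inspecting attributes at runtime:

```python
# Hypothetical marker: records a flag on the function object so a
# compiler or runtime could treat only marked calls as suspension points.

def may_suspend(func):
    """Mark func as allowed to raise SuspendIteration."""
    func.may_suspend = True
    return func

@may_suspend
def read_row(cursor):
    ...  # low-level: may raise SuspendIteration when rows aren't ready

def render(title):
    return '<td>%s</td>' % title   # ordinary function, never suspends

print(getattr(read_row, 'may_suspend', False))  # True
print(getattr(render, 'may_suspend', False))    # False
```

Under this scheme, a SuspendIteration escaping an unmarked function would be a programming error, matching Clark's "deal-breaker" remark.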
James Y Knight wrote:
On Sep 7, 2004, at 9:48 PM, Clark C. Evans wrote:
I've packaged up the idea of a coroutine facility using iterators and an exception, SuspendIteration.
Very interesting.
This proposal assumes that a corresponding iterator written using this class-based method is possible for existing generators. The challenge seems to be the identification of distinct states within the generator where suspension could occur.
That is basically impossible. Essentially *every* operation could possibly raise SuspendIteration, because essentially every operation can call an arbitrary python function, and python functions can raise any exception they want. I think you could still make the proposal work in CPython: if I understand its internals properly, it doesn't need to do a transformation to a class iterator, it simply suspends the frame wherever it is. Thus, being able to suspend at any point in the function would not cause an undue performance degradation.
I don't think it is that simple for CPython either; a single bytecode can potentially invoke more than just a single builtin or other Python code. For example, object construction can invoke __new__ and __init__, and then there are all the cases where descriptors are involved with their __get__ etc. (and __add__, __radd__, ...). So bytecodes are not the right suspension/resumption granularity, because you don't want to reinvoke things that could have had side effects. So you have all the points per bytecode where Python code/builtins can be invoked, or, from another point of view, where an exception can be detected.

If I understand the proposal (which is quite vague), like restartable syscalls, there is also the matter that whatever raised the SuspendIteration should be retried on resumption of the generator, e.g. calling a nested generator's next(). So one would have to cherry-pick, for each bytecode or similar abstract operations model, the relevant suspension/resumption points, and it would still be quite a daunting task to implement this, adding the code for intra-bytecode resumption. (Of course this assumes that capturing the C stack and similar techniques are out of the question.)
However, I think it is a deal-breaker for JPython. From the generator PEP: "It's also believed that efficient implementation in Jython requires that the compiler be able to determine potential suspension points at compile-time, and a new keyword makes that easy." If this quote is right about the implementation of Jython (and it seems likely, given the JVM), your proposal makes it impossible to implement generators in Jython.
A hand-coded implementation would be a *lot* of work (beyond practical) for potentially very bad performance and a resulting messy codebase. One could also encounter code-size problems or issues with the verifier.
On Wed, Sep 08, 2004 at 08:58:21PM +0200, Samuele Pedroni wrote:
| If I understand the proposal (which is quite vague), like restartable
| syscalls, there is also the matter that whatever raised the
| SuspendIteration should be retried on resumption of the generator, e.g
| calling nested generator next.

That's exactly the idea. The SuspendIteration exception could contain, however, the file/socket that it is blocked on, so a smart scheduler need not be blindly restarting things.

| So one would have to cherry pick for each bytecode or similar abstract
| operations model relevant suspension/resumption points and it would
| still be quite a daunting task to implement this adding the code
| for intra bytecode resumption. (Of course this assumes that capturing
| the C stack and similar techniques are out of question)

I was assuming that only calls within the generator to next(), implicit or otherwise, would be suspension points. This covers all of my use cases anyway. In the other situations, if they are even useful, don't do that. Convert the SuspendIteration to a RuntimeError, or resume at the previous suspension point?

The idea of the PEP was that a nested-generator context provides this limited set of suspension points to make an implementation possible.

Kind Regards,

Clark
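The 'smart scheduler' idea could be sketched as follows: the exception carries the resource it is blocked on, and the scheduler parks the iterator until that resource is ready instead of blindly retrying. Channel, ChannelReader, and the scripted event feed are invented stand-ins for real file descriptors and a select() loop:

```python
# Sketch only: everything here is a stand-in.  A real scheduler would
# pass the carried file descriptors to select()/poll() instead of
# consuming a scripted event list.

class SuspendIteration(Exception):
    def __init__(self, resource=None):
        super().__init__(resource)
        self.resource = resource     # what the iterator is blocked on

class Channel:
    """Toy resource: 'ready' once some data has arrived."""
    def __init__(self):
        self.data = []

    @property
    def ready(self):
        return bool(self.data)

class ChannelReader:
    """Class-style iterator producing n values from a channel."""
    def __init__(self, chan, n):
        self._chan, self._left = chan, n

    def __iter__(self):
        return self

    def __next__(self):
        if self._left == 0:
            raise StopIteration
        if not self._chan.ready:
            raise SuspendIteration(self._chan)   # say *what* we wait for
        self._left -= 1
        return self._chan.data.pop(0)

def schedule(iterators, events):
    """Drive iterators; park suspended ones on their resource and wake
    them only when that resource receives data (a scripted 'select')."""
    results, parked = [], {}
    runnable = [iter(it) for it in iterators]
    events = list(events)
    while runnable or parked:
        for it in list(runnable):
            try:
                results.append(next(it))
            except SuspendIteration as e:
                runnable.remove(it)
                parked[e.resource] = it    # wait for this resource
            except StopIteration:
                runnable.remove(it)
        if events:                         # simulate I/O readiness
            chan, value = events.pop(0)
            chan.data.append(value)
            if chan in parked:
                runnable.append(parked.pop(chan))
        elif not runnable:
            break                          # nothing can ever wake up
    return results

chan = Channel()
print(schedule([ChannelReader(chan, 2)], [(chan, 'x'), (chan, 'y')]))
# ['x', 'y']
```

A parked iterator costs nothing per scheduling pass, which is the advantage of carrying the blocked resource in the exception rather than retrying every suspended iterator each round.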
Clark C. Evans wrote:
On Wed, Sep 08, 2004 at 08:58:21PM +0200, Samuele Pedroni wrote:
| If I understand the proposal (which is quite vague), like restartable
| syscalls, there is also the matter that whatever raised the
| SuspendIteration should be retried on resumption of the generator, e.g
| calling nested generator next.
That's exactly the idea. The SuspendIteration exception could contain, however, the file/socket that it is blocked on, so a smart scheduler need not be blindly restarting things.
| So one would have to cherry pick for each bytecode or similar abstract
| operations model relevant suspension/resumption points and it would
| still be quite a daunting task to implement this adding the code
| for intra bytecode resumption. (Of course this assumes that capturing
| the C stack and similar techniques are out of question)
I was assuming that only calls within the generator to next(), implicit or otherwise, would be suspension points.
I missed that.
This covers all of my use cases anyway. In the other situations, if they are even useful, don't do that. Convert the SuspendIteration to a RuntimeError, or resume at the previous suspension point?
The idea of the PEP was that a nested-generator context provides this limited set of suspension points to make an implementation possible.
Then the PEP needs clarification, because I had the impression that in

    def g(src):
        data = src.read()
        yield data
        data = src.read()
        yield data

the read itself could throw a SuspendIteration, and upon the successive next() the src.read() itself would be retried. But if only next() calls can be suspension points, then the generator would not be resumable in this case. Which is the case?
On Wed, Sep 08, 2004 at 09:33:10PM +0200, Samuele Pedroni wrote:
| Clark C. Evans wrote:
| >I was assuming that only calls within the generator to next(), implicit
| >or otherwise, would be suspension points.
|
| I missed that.

*nod* I will fix the PEP.

| >This covers all of my use cases anyway. In the other situations, if
| >they are even useful, don't do that. Convert the SuspendIteration to a
| >RuntimeError, or resume at the previous suspension point?
| >
| >The idea of the PEP was that a nested-generator context provides this
| >limited set of suspension points to make an implementation possible.
|
| then the PEP needs clarification because I had the impression that
|
| def g(src):
|     data = src.read()
|     yield data
|     data = src.read()
|     yield data

The data producers would all be iterators, ones that could possibly raise SuspendIteration() from within their next() method.

| the read itself could throw a SuspendIteration

If read() did raise a SuspendIteration() exception, then it would make sense to terminate the generator, perhaps with a RuntimeError. I just hadn't considered this case. If someone has a clever solution that makes this case work, great, but it's not something that I was contemplating.

| and upon the sucessive next the src.read() itself would be retried.
| But if it's only nexts that can be suspension points then the
| generator would not be resumable in this case.

Right. I was musing (but it's not in the PEP) that iter() would sprout an option that lets the producer know if it can suspend. If a generator that was itself called with this suspend flag asked for a child generator, then the suspend flag would be carried. But this is a separate issue.

Thanks for thinking about this PEP.

Clark

--
Clark C. Evans                          Prometheus Research, LLC.
Clark C. Evans wrote:
On Wed, Sep 08, 2004 at 09:33:10PM +0200, Samuele Pedroni wrote:
| Clark C. Evans wrote:
| >I was assuming that only calls within the generator to next(), implicit
| >or otherwise, would be suspension points.
|
| I missed that.
*nod* I will fix the PEP.
| >This covers all of my use cases anyway. In the other situations, if
| >they are even useful, don't do that. Convert the SuspendIteration to a
| >RuntimeError, or resume at the previous suspension point?
| >
| >The idea of the PEP was that a nested-generator context provides this
| >limited set of suspension points to make an implementation possible.
|
| then the PEP needs clarification because I had the impression that
|
| def g(src):
|     data = src.read()
|     yield data
|     data = src.read()
|     yield data
The data producers would all be iterators, ones that could possibly raise SuspendIteration() from within their next() method.
| the read itself could throw a SuspendIteration
If read() did raise a SuspendIteration() exception, then it would make sense to terminate the generator, perhaps with a RuntimeError. I just hadn't considered this case. If someone has a clever solution that makes this case work, great, but it's not something that I was contemplating.
Thinking about it, though, this case is no different:

    def g(src):
        data = src.next()
        yield data
        data = src.next()
        yield data

    def g(src):
        demand = src.next
        data = demand()
        yield data
        data = demand()
        yield data

What is supposed to happen here? Notice that you may know that src.next is an iterator's next at runtime, but not at compile time.
Hi,

I agree with Samuele that the proposal is far too vague currently. You
should try to describe what precisely should occur in each situation.

A major problem I see with the proposal is that you can describe what
should occur in some situations by presenting source code snippets; such
descriptions correspond easily to possible semantics at the bytecode
level. But bytecode is not a natural granularity for coroutine issues.
Frames (either of generators or functions) execute operations that may
invoke new frames, and all frames in the chain except possibly the most
recent one need to be suspended *during* the execution of their current
bytecode. For example, a generator f() may currently be calling a
generator g() with a FOR_ITER bytecode ('for' statement), a CALL_FUNCTION
(calling next()), or actually anything else, like a BINARY_ADD which calls
an nb_add implemented in C which indirectly calls back to Python code.

For this reason it is not reasonably possible to implement restartable
exceptions in general: when an exception is caught, not all the C state
is saved (i.e. you don't know where, *within* the execution of a
bytecode, you should restart). Your PEP is very similar to restartable
exceptions: their possible semantics are difficult to specify in general.
You may try to do that to understand what I mean.

This doesn't mean that it is impossible to figure out a more limited
concept, like you are trying to do. However, keeping the "restartable
exception" idea in mind should help focus on the difficult problems and
where restrictions are needed.

I think that Stackless contains all the solutions in this area, and I'm
not talking about the C stack hacking. Stackless is sometimes able to
switch coroutines without hacking at the C stack. I think that if any
coroutine support is ever going to be added to CPython it will be done in
a similar way. (Generators were also inspired by Stackless, BTW.)
(Also note that although the generator syntax is nice and helpful, it
would have been possible to write generators without any custom 'yield'
syntax if we had restartable exceptions; this makes the latter idea more
general and independent from generators.)

A bientôt,

Armin.
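Armin's point about restartable exceptions can be seen in miniature with today's generators: once an exception escapes a generator frame, CPython marks that frame as finished and it can never be resumed, which is precisely why SuspendIteration would need deep interpreter changes. A short demonstration:

```python
def g():
    yield 1
    raise RuntimeError("boom")
    yield 2          # never reached: the frame dies with the exception

it = g()
assert next(it) == 1

try:
    next(it)
except RuntimeError:
    pass             # the exception escaped the generator frame

# The generator is now permanently exhausted; it cannot be restarted
# at (or after) the point of failure:
try:
    next(it)
except StopIteration:
    print("cannot resume a generator after an exception escaped it")
```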
Armin,

On Thu, Sep 09, 2004 at 11:14:44AM +0100, Armin Rigo wrote:
| I agree with Samuele that the proposal is far too vague currently. You
| should try to describe what precisely should occur in each situation.

Oh, absolutely. This was a draft PEP to collect feedback. It will be a
bit before I have a chunk of time to assimilate the comments and produce
another (more detailed) draft. Your comments were very helpful; I've got
a bit of education in my future.

| A major problem I see with the proposal is that you can describe what
| should occur in some situations by presenting source code snippets; such
| descriptions correspond easily to possible semantics at the bytecode
| level. But bytecode is not a natural granularity for coroutine issues.

*nod*

| This doesn't mean that it is impossible to figure out a more limited
| concept, like you are trying to do. However keeping the "restartable
| exception" idea in mind should help focusing on the difficult problems
| and where restrictions are needed.

Best,

Clark
To distill this request to a sentence: I would like syntax-level support
in Python for a Continuation Passing Style (CPS) of code execution.

It is important to note that Ruby, Parrot (next-generation Perl), and
SML/NJ all support this async programming style. In Python land, the
Twisted framework uses this style via its Deferred mechanism. This isn't
an off-the-wall request.

I currently think that a generator syntax would be the best, and this
proposal is for further work via defining a SuspendIteration semantics.
However, I'm not tied to this implementation. A pre-parser which made
Deferred object handling nicer could also work, or any other option that
provides an intuitive syntax for CPS in Python.

The hoops that Twisted has to jump through to wrap exceptions for use in
a Deferred processing chain, and also the (completely necessary but yet)
convoluted ways of combining Deferreds, are, IMHO, a direct result of the
lack of support for CPS in Python. These items have a huge impact on
application program readability and maintenance. Clean syntax-level
support for CPS in Python would be a boon for application developers.

Best,

Clark
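The Deferred chaining Clark refers to can be reduced to a very small sketch. The class below is a toy illustration of the callback-chaining idea only; it is not Twisted's actual Deferred API, and the names `add_callback` and `fire` are invented for this example (Twisted's real methods are named differently and handle errbacks, nesting, and more):

```python
class Deferred:
    """Toy sketch of Deferred-style chaining: each callback receives the
    previous callback's return value, forming a processing pipeline."""
    def __init__(self):
        self.callbacks = []
        self.fired = False
        self.result = None

    def add_callback(self, fn):
        self.callbacks.append(fn)
        if self.fired:                       # late-added callbacks run at once
            self.result = fn(self.result)
        return self                          # allow chaining calls

    def fire(self, value):
        """Deliver the (possibly long-awaited) value through the chain."""
        self.fired = True
        self.result = value
        for fn in self.callbacks:
            self.result = fn(self.result)    # each step feeds the next
        return self.result

d = Deferred()
d.add_callback(lambda v: v + 1).add_callback(lambda v: v * 2)
print(d.fire(20))   # (20 + 1) * 2 = 42
```

The "hoops" Clark mentions become visible even here: every step of the program after the asynchronous result must be packaged as a separate callable, which is exactly what syntax-level CPS support would avoid.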
> It is important to note that Ruby, Parrot (next-generation Perl), and
> SML/NJ all support this async programming style. In Python [...]
For those of us who aren't current on the latest happenings of Ruby,
Parrot and SML/NJ, it may be convenient for us to hear precisely how
"async programming style" is done in those languages, so we have a
reference, and so that we can agree (or disagree) with you about whether
they are equivalent to your PEP.

It would also be nice if you were to do a bit of research on the
internals of those languages, to discover how it is actually implemented.
This would allow Python interpreter hackers to say, "Yes, that kind of
thing is possible," "Maybe with a bit of work," "It is not possible with
the current interpreter," or even "It wouldn't be usable on Jython."

With that said, I believe there is a general consensus that this kind of
thing would be useful. For me, if I had greenlets everywhere I'd be happy
(though I understand that this may not be technically possible on
Jython).

- Josiah
Josiah Carlson wrote:
>> It is important to note that Ruby, Parrot (next-generation Perl), and
>> SML/NJ all support this async programming style. In Python [...]
> For those of us who aren't current on the latest happenings of Ruby,
> Parrot and SML/NJ, it may be convenient for us to hear precisely how
> "async programming style" is done in those languages,
Some clarifications WRT Parrot. Parrot isn't a language, and Parrot isn't
"next-generation Perl". Parrot is a virtual machine that will run Perl6,
and it currently runs languages like Python, Tcl, m4, Forth, and others
more or less completely [1].

Parrot's function calling scheme is CPS. A Python generator function gets
automatically translated to a coroutine. Returning from a plain function
is done by invoking a continuation. And you can of course (in Parrot
assembly) create a continuation, store it away, and invoke it at any time
later, which will continue program execution at the point where it should
continue.

Please note that this has nothing to do with "async programming". It's
just like a GOTO, but with almost no limitations on where you'll branch
to: you can't cross C-stack boundaries, or in other words you can't
branch to other incarnations of the run-loop. (Exceptions are a bit more
flexible though, but they still can only jump "up" the C-stack.)

Using CPS for function calls therefore implies a non-trivial rewrite of
CPython, which OTOH and AFAIK is already available as Stackless Python.
Making continuations usable at the language level is a different thing,
though.

leo

[1] http://www.parrotcode.org - in CVS languages/python. The test b2.py
from the Pie-thon benchmark has two generators (izip, Pi.__iter__), which
are Parrot coroutines; that's working fine.
At 09:30 PM 9/30/04 +0200, Leopold Toetsch wrote:
> Please note that this has nothing to do with "async programming". It's
> just like a GOTO, but with almost no limitations on where you'll branch
> to: you can't cross C-stack boundaries, or in other words you can't
> branch to other incarnations of the run-loop. (Exceptions are a bit
> more flexible though, but they still can only jump "up" the C-stack.)
> Using CPS for function calls therefore implies a non-trivial rewrite of
> CPython, which OTOH and AFAIK is already available as Stackless Python.
Clark is talking about a limited subset of CPS, where continuations are only single-use. That is, a very limited form of continuations roughly equivalent in power to either Greenlets or a stack of generator-iterators.
> Making continuations usable at the language level is a different thing,
> though.
Indeed, and luckily it isn't needed for PEP 334. PEP 334 just needs the
interpreter to be able to resume evaluation of a generator frame at any
CALL opcode or "for" loop that invokes a generator-iterator's next()
method, if SuspendIteration was raised. I don't know if a corresponding
operation for Jython is possible. (In the case of CPython, this could be
implemented via a type slot to check whether a callable object is
"resumable", so that you actually *could* decorate suitable functions as
being resumable, not just generator-iterator next() methods.)

Personally, I'm +0 (at most) on the PEP at the moment, as it doesn't IMO
add much over using a generator stack, such as what I use in
'peak.events'. I'd be much more interested in a way to pass values and
exceptions *into* generators, which would be more in line with what I'd
consider "simple coroutines". A mechanism to pass values or exceptions
into generators would let me replace the hackish bits of 'peak.events'
with clean language features, but I'm not sure PEP 334 would give me
enough to be worth reorganizing my code, as it's presently defined.

Also, I find the current PEP a confusing mishmash of references to
various technologies (that are all harder to implement than what's
actually desired) and unmotivating implementations of things I can't see
wanting to do. It would be helpful for it to focus on motivating usage
examples (such as suspending a report while waiting for a database)
*early* in the PEP, rather than burying them at the end. And most of the
sample Python code looks to me like examples of how an implementation
might work, but they don't illustrate the intended semantics well, nor do
they really help with designing an implementation.

Finally, the PEP shouldn't call these co-routines, as co-routines are
able to "return" values to other co-routines.
The title should be something more like "Resuming Generators after SuspendIteration", which much more accurately describes the scope of the desired result.
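As a historical note, the mechanism Phillip wishes for here, passing values and exceptions *into* a generator, was later added to Python by PEP 342 ("Coroutines via Enhanced Generators", Python 2.5), which introduced generator.send() and generator.throw(). A minimal illustration of the send() half, in modern Python:

```python
def averager():
    """A generator that consumes values via send() and yields the
    running average back to the caller -- a simple coroutine."""
    total, count = 0.0, 0
    avg = None
    while True:
        value = yield avg      # receives whatever the caller send()s
        total += value
        count += 1
        avg = total / count

g = averager()
next(g)               # prime the generator: advance to the first yield
print(g.send(10))     # 10.0
print(g.send(30))     # 20.0
```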
On Thu, 30 Sep 2004, Phillip J. Eby wrote: ...
> A mechanism to pass values or exceptions into generators
[ Possibly somewhat off topic, and apologies if it is, and I'm positive
someone's done something similar before, but I think it's relevant to the
discussion in hand -- largely because the above use case *doesn't*
require changes to Python... ]

A minimal(ish) example of a technique for doing this, which I presented
last week at a streaming workshop, looked like the following (amongst
other stuff...).

Create a decorator that wraps the generator inside a class derived from a
supplied base class (defaulting to object):

    import copy

    def wrapgenerator(bases=object, **attrs):
        def decorate(func):
            class statefulgenerator(bases):
                __doc__ = func.__doc__
                def __init__(self, *args):
                    super(statefulgenerator, self).__init__(*args)
                    self.func = func(self, *args)
                    for k in attrs.keys():
                        self.__dict__[k] = copy.deepcopy(attrs[k])
                    self.next = self.__iter__().next
                def __iter__(self):
                    return iter(self.func)
            return statefulgenerator
        return decorate

Create a class to handle the behaviour you wish to use to communicate
with the generator from outside:

    class component(object):
        def __init__(self, *args):
            # Default queues
            self.queues = {"inbox": [], "control": [],
                           "outbox": [], "signal": []}
        def send(self, box, object):
            self.queues[box].append(object)
        def dataReady(self, box):
            return len(self.queues[box]) > 0
        def recv(self, box):  # NB. Exceptions aren't caught
            X = self.queues[box][0]
            del self.queues[box][0]
            return X

Then just use something like this:

    @wrapgenerator(component)
    def forwarderNay(self):
        "Simple data forwarding generator"
        while 1:
            if self.dataReady("inbox"):
                self.send("outbox", self.recv("inbox") + "Nay")
            elif self.dataReady("control"):
                if self.recv("control") == "shutdown":
                    break
            yield 1
        self.send("signal", "shutdown")
        yield 0

In conjunction with a simplistic scheduler and linkage functions, this
allows you to have something similar to CSP. I've come to the conclusion
recently that the fact you can't yield across multiple levels is actually
beneficial, because it encourages you to use many small components.
Used standalone you can do this though:

    f = forwarderNay()
    for word in ["hello", "world", "test"]:
        f.send("inbox", word)
        f.next()
        print f.recv("outbox"),
    print

Which of course outputs:

    helloNay worldNay testNay

For small scale things this amount of cruft is a bit of a pain. We're
using *essentially* this approach to build network servers, and it seems
a rather satisfying way of doing so, TBH. (Not exactly this approach, so
if this looks odd, that's why -- I'm not allowed to release the source
for precisely what we're doing :-( but the above was on the slides I
*was* able to talk about... :-)

Hope that's of some use...

Best Regards,

Michael.
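The "stack of generator-iterators" Phillip mentions, and the inability to yield across multiple levels that Michael notes, can both be sketched with a small trampoline: when a generator yields another generator, the runner pushes it and drives it in place, so nesting costs a list entry rather than a C-stack frame. The `trampoline` helper below is a hypothetical illustration written in modern Python, not taken from any of the posted code:

```python
import types

def trampoline(gen):
    """Drive a generator, treating any yielded generator as a nested
    call: push it on a stack, run it to exhaustion, then resume the
    caller. Plain yielded values are collected as output."""
    stack = [gen]
    results = []
    while stack:
        try:
            item = next(stack[-1])
        except StopIteration:
            stack.pop()                 # sub-generator finished: "return"
            continue
        if isinstance(item, types.GeneratorType):
            stack.append(item)          # "call" into the sub-generator
        else:
            results.append(item)        # a plain value: emit it
    return results

def inner():
    yield "b"
    yield "c"

def outer():
    yield "a"
    yield inner()   # yielding a generator means "run this one now"
    yield "d"

print(trampoline(outer()))   # ['a', 'b', 'c', 'd']
```

This is essentially the pattern that asynchronous frameworks built on generator stacks used before native delegation syntax existed.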
participants (9)
- Armin Rigo
- Clark C. Evans
- James Y Knight
- Josiah Carlson
- Jp Calderone
- Leopold Toetsch
- Michael Sparks
- Phillip J. Eby
- Samuele Pedroni