[Python-Dev] Minimal 'stackless' PEP using generators?
Phillip J. Eby
pje at telecommunity.com
Mon Aug 23 19:18:28 CEST 2004
At 12:34 PM 8/23/04 -0400, Clark C. Evans wrote:
>On Mon, Aug 23, 2004 at 11:56:04AM -0400, Phillip J. Eby wrote:
>| It doesn't seem to me to actually help anything. You can already do this
>| using a simple wrapper object that maintains a stack of active
>| generators, as I do in 'peak.events'.
>
>Could you provide an example? The problem this proposal solves is
>straightforward -- it is tedious and slow to have intermediate
>generators do stuff like:
>
> def middle():
>     """ intermediate generator _only_ sees one and two """
>     for x in top():
>!        if isinstance(x, X):    # pass control values straight through
>!            yield x
>!            continue
>         print "middle", x
>         yield x
>
>This extra step is tedious and also slow, especially if one has lots of
>yield statements that cooperate.
'peak.events' uses "Task" objects that maintain a stack of active
generators. The Task receives yields from the "innermost" generator
directly, without them being passed through by intermediate generators. If
the value yielded is *not* a control value, the Task object pops the
generator stack, and resumes the previously suspended generator. A "magic"
function, 'events.resume()', retrieves the value from the Task inside the
stopped generator.
Basically, this mechanism doesn't pass control values through multiple
tests and generator frames: control values are consumed immediately by the
Task. This makes it easy to suspend nested generators while waiting for
some event, such as socket readability, a timeout, a Twisted "Deferred",
etc. Yielding an "event" object like one of the aforementioned items
causes the Task to return to its caller (the event loop) after requesting a
callback for the appropriate event. When the callback re-invokes the
Task, it saves the value associated with the event, if any, for
'events.resume()' to retrieve when the topmost generator is resumed.
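Roughly speaking, the driving loop looks something like this. This is *not*
the peak.events code; 'run_task', 'resume', and 'is_control' are made-up
names for illustration, and the real events.resume() does something much
smarter than reading a module global:

    import types

    _pending = None                 # value the next resume() call hands back

    def resume():
        return _pending

    def run_task(root, is_control=lambda v: False):
        global _pending
        stack = [root]              # active generator frames, innermost last
        while stack:
            try:
                yielded = stack[-1].next()
            except StopIteration:
                stack.pop()         # inner generator finished; resume its caller
                continue
            if isinstance(yielded, types.GeneratorType):
                stack.append(yielded)    # "calling" into a sub-generator
            elif is_control(yielded):
                _pending = None          # a real Task would wait on the event here
            else:
                _pending = yielded       # ordinary value: pop, caller resume()s it
                stack.pop()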
Also, 'events.resume()' supports passing errors from one generator to the
next, so that it's "as if" the generators execute in a nested fashion. The
drawback is that you must invoke events.resume() after each yield, but this
is *much* less intrusive than requiring generators to pass through results
from all nested generators. Take a look at:
http://cvs.eby-sarna.com/PEAK/src/peak/events/
In particular, the 'interfaces' and 'event_threads' modules. Here's a
usage example, a simple Task procedure:
    @events.taskFactory
    def monitorBusy(self):
        # get a "readable" event on this socket
        untilReadable = self.eventLoop.readable(self)

        while True:
            # Wait until we have stream activity
            yield untilReadable; events.resume()

            # Is everybody busy?
            if self.busyCount() == self.childCount():
                self.supervisor.requestStart()

            # Wait until the child or busy count changes before proceeding
            yield events.AnyOf(self.busyCount, self.childCount); events.resume()
This task waits until a listener socket is readable (i.e. an incoming
connection is pending), and then asks the process supervisor to start more
processes if all the child processes are busy. It then waits until either
the busy count or the child process count changes, before it waits for
another incoming connection.
Basically, if you're invoking a sub-generator, you do:
    yield subGenerator(arguments); result = events.resume()
This is if you're calling a sub-generator that only returns one "real"
result. You needn't worry about passing through control values, because
the current generator won't be resumed until the subgenerator yields a
non-control value.
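In terms of the toy run_task/resume sketch from earlier, that "subroutine"
pattern plays out like this (runnable, though it skips any real event
waiting; 'inner' and 'outer' are made-up names):

    def inner():
        yield 42                    # the one "real" (non-control) result

    def outer():
        yield inner(); value = resume()
        assert value == 42          # resumed only after inner() yielded a real value

    run_task(outer())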
If you're invoking a sub-generator that you intend to *iterate over*,
however, and that generator can suspend on events, it's a bit more complex:
    iterator = subGenerator(arguments)

    while True:
        yield iterator; nextItem = events.resume()
        if nextItem is NOT_GIVEN:   # sentinel value
            break
        # body of loop goes here, using 'nextItem'
This is not very convenient, but I don't find it all that common to have
data I'm iterating over in such a fashion, because 'peak.events' programs
tend to have "infinite" streams that are organized as event sources in
"pipe and filter" fashion. So, you tend to end up with Tasks that only
have one generator running anyway, except for things that are more like
"subroutines" than real generators, because you only expect one real return
value from them.
peak.events can work with Twisted, by the way, if you have it
installed. For example, this:
    yield aDeferred; result = events.resume()
suspends the generator until the Deferred fires, and then the result will
be placed in 'result' upon resumption of the generator. If the Deferred
triggers an "errback", the call to 'events.resume()' will reraise the
error inside the current generator.
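In the toy sketch from earlier, that error pass-through could be modelled by
letting the driver record an exception for the next resume() call to
reraise. Again, these names are invented, and this is not how peak.events
actually does it:

    _pending_error = None           # exception recorded on behalf of an "errback"

    def resume():
        global _pending_error
        if _pending_error is not None:
            err, _pending_error = _pending_error, None
            raise err               # surfaces inside the generator being resumed
        return _pending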
It would be nice if there were some way to "accept" data and exceptions
within a generator that didn't require the 'events.resume' hack, e.g.:
    result = yield aDeferred
would be really nice, especially if 'result' could cause an exception to be
raised. I was hoping that this was something along the lines of what you
were proposing. E.g. if generator-iterators could take arguments to
'next()' that would let you do this. I believe there's already a rejected
PEP covering the issue of communicating "into" generators.
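For illustration, here's a sketch of that wished-for style, assuming a
generator protocol that can push a value (or an exception) back in at the
point of the yield, essentially what later Python versions expose as
gen.send() and gen.throw(); the 'work' generator and the driver calls below
are invented, and none of this works with the generators discussed above:

    def work(source):
        try:
            result = yield source    # resumed *with* the event's result
        except ValueError:
            result = None            # an "errback" would surface as an exception here
        yield result                 # only so the driver below can observe it

    g = work("some event source")
    g.next()                         # run up to the first yield
    print g.send(42)                 # resume with a value; prints 42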
Perhaps there should be a "simple coroutines" PEP, that doesn't try to
extend generators into coroutines, but instead treats coroutines as a
first-class animal that just happens to be implemented using some of the
same techniques "under the hood".