Revised revised revised PEP on yield-from

Fourth draft of the PEP. Corrected an error in the expansion and added a bit more to the Rationale.

PEP: XXX
Title: Syntax for Delegating to a Subgenerator
Version: $Revision$
Last-Modified: $Date$
Author: Gregory Ewing <greg.ewing@canterbury.ac.nz>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 13-Feb-2009
Python-Version: 2.7
Post-History:


Abstract
========

A syntax is proposed to allow a generator to easily delegate part of its operations to another generator, the subgenerator interacting directly with the main generator's caller for as long as it runs. Additionally, the subgenerator is allowed to return with a value, and the value is made available to the delegating generator.

The new syntax also opens up some opportunities for optimisation when one generator re-yields values produced by another.


Proposal
========

The following new expression syntax will be allowed in the body of a generator::

    yield from <expr>

where <expr> is an expression evaluating to an iterable, from which an iterator is extracted. The effect is to run the iterator to exhaustion, during which time it behaves as though it were communicating directly with the caller of the generator containing the ``yield from`` expression (the "delegating generator").

In detail:

* Any values that the iterator yields are passed directly to the caller.

* Any values sent to the delegating generator using ``send()`` are sent directly to the iterator. (If the iterator does not have a ``send()`` method, values sent in are ignored.)

* Calls to the ``throw()`` method of the delegating generator are forwarded to the iterator. (If the iterator does not have a ``throw()`` method, the thrown-in exception is raised in the delegating generator.)

* If the delegating generator's ``close()`` method is called, the iterator is finalised before finalising the delegating generator.

The value of the ``yield from`` expression is the first argument to the ``StopIteration`` exception raised by the iterator when it terminates.

Additionally, generators will be allowed to execute a ``return`` statement with a value, and that value will be passed as an argument to the ``StopIteration`` exception.


Formal Semantics
----------------

The statement ::

    result = yield from expr

is semantically equivalent to ::

    _i = iter(expr)
    try:
        _u = _i.next()
        while 1:
            try:
                _v = yield _u
            except Exception, _e:
                if hasattr(_i, 'throw'):
                    _i.throw(_e)
                else:
                    raise
            else:
                if hasattr(_i, 'send'):
                    _u = _i.send(_v)
                else:
                    _u = _i.next()
    except StopIteration, _e:
        _a = _e.args
        if len(_a) > 0:
            result = _a[0]
        else:
            result = None
    finally:
        if hasattr(_i, 'close'):
            _i.close()


Rationale
=========

A Python generator is a form of coroutine, but has the limitation that it can only yield to its immediate caller. This means that a piece of code containing a ``yield`` cannot be factored out and put into a separate function in the same way as other code. Performing such a factoring causes the called function to itself become a generator, and it is necessary to explicitly iterate over this second generator and re-yield any values that it produces.

If yielding of values is the only concern, this is not very arduous and can be performed with a loop such as ::

    for v in g:
        yield v

However, if the subgenerator is to interact properly with the caller in the case of calls to ``send()``, ``throw()`` and ``close()``, things become considerably more complicated.
As the formal expansion presented above illustrates, the necessary code is very longwinded, and it is tricky to handle all the corner cases correctly. In this situation, the advantages of a specialised syntax should be clear.


Generators as Threads
---------------------

A motivating use case for generators being able to return values concerns the use of generators to implement lightweight threads. When using generators in that way, it is reasonable to want to spread the computation performed by the lightweight thread over many functions. One would like to be able to call a subgenerator as though it were an ordinary function, passing it parameters and receiving a returned value.

Using the proposed syntax, a statement such as ::

    y = f(x)

where f is an ordinary function, can be transformed into a delegation call ::

    y = yield from g(x)

where g is a generator. One can reason about the behaviour of the resulting code by thinking of g as an ordinary function that can be suspended using a ``yield`` statement.

When using generators as threads in this way, typically one is not interested in the values being passed in or out of the yields. However, there are use cases for this as well, where the thread is seen as a producer or consumer of items. The ``yield from`` expression allows the logic of the thread to be spread over as many functions as desired, with the production or consumption of items occurring in any subfunction, and the items are automatically routed to or from their ultimate source or destination.

Concerning ``throw()`` and ``close()``, it is reasonable to expect that if an exception is thrown into the thread from outside, it should first be raised in the innermost generator where the thread is suspended, and propagate outwards from there; and that if the thread is terminated from outside by calling ``close()``, the chain of active generators should be finalised from the innermost outwards.


Syntax
------

The particular syntax proposed has been chosen as suggestive of its meaning, while not introducing any new keywords and clearly standing out as being different from a plain ``yield``.


Optimisations
-------------

Using a specialised syntax opens up possibilities for optimisation when there is a long chain of generators. Such chains can arise, for instance, when recursively traversing a tree structure. The overhead of passing ``next()`` calls and yielded values down and up the chain can cause what ought to be an O(n) operation to become O(n\*\*2).

A possible strategy is to add a slot to generator objects to hold a generator being delegated to. When a ``next()`` or ``send()`` call is made on the generator, this slot is checked first, and if it is nonempty, the generator that it references is resumed instead. If it raises StopIteration, the slot is cleared and the main generator is resumed.

This would reduce the delegation overhead to a chain of C function calls involving no Python code execution. A possible enhancement would be to traverse the whole chain of generators in a loop and directly resume the one at the end, although the handling of StopIteration is more complicated then.


Use of StopIteration to return values
-------------------------------------

There are a variety of ways that the return value from the generator could be passed back. Some alternatives include storing it as an attribute of the generator-iterator object, or returning it as the value of the ``close()`` call to the subgenerator.
However, the proposed mechanism is attractive for a couple of reasons:

* Using the StopIteration exception makes it easy for other kinds of iterators to participate in the protocol without having to grow a close() method.

* It simplifies the implementation, because the point at which the return value from the subgenerator becomes available is the same point at which StopIteration is raised. Delaying until any later time would require storing the return value somewhere.


Criticisms
==========

Under this proposal, the value of a ``yield from`` expression would be derived in a very different way from that of an ordinary ``yield`` expression. This suggests that some other syntax not containing the word ``yield`` might be more appropriate, but no alternative has so far been proposed, other than ``call``, which has already been rejected by the BDFL.

It has been suggested that some mechanism other than ``return`` in the subgenerator should be used to establish the value returned by the ``yield from`` expression. However, this would interfere with the goal of being able to think of the subgenerator as a suspendable function, since it would not be able to return values in the same way as other functions.

The use of an argument to StopIteration to pass the return value has been criticised as an "abuse of exceptions", without any concrete justification of this claim. In any case, this is only one suggested implementation; another mechanism could be used without losing any essential features of the proposal.


Alternative Proposals
=====================

Proposals along similar lines have been made before, some using the syntax ``yield *`` instead of ``yield from``. While ``yield *`` is more concise, it could be argued that it looks too similar to an ordinary ``yield`` and the difference might be overlooked when reading code.

To the author's knowledge, previous proposals have focused only on yielding values, and thereby suffered from the criticism that the two-line for-loop they replace is not sufficiently tiresome to write to justify a new syntax. By dealing with sent values as well as yielded ones, this proposal provides considerably more benefit.


Copyright
=========

This document has been placed in the public domain.

--
Greg
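
To see the proposed pieces working together, here is a minimal sketch (not part of the PEP itself, and purely hypothetical, since it relies on the new ``yield from`` syntax and on ``return`` with a value inside a generator): a subgenerator accumulates values sent in by the caller, and its result comes back to the delegating generator as the value of the yield-from expression.

    def averager():
        # Subgenerator: accumulates values sent in by the caller and,
        # under this proposal, returns the average when None is sent.
        total = 0.0
        count = 0
        while True:
            value = yield
            if value is None:
                return total / count      # "return with a value" -- proposed
            total += value
            count += 1

    def report():
        # Delegating generator: the caller's send() calls pass straight
        # through to averager(), whose return value becomes the value
        # of the yield-from expression.
        avg = yield from averager()
        print "average was", avg

    g = report()
    g.next()                  # run forward to the first yield (inside averager)
    for x in 1, 2, 3, 4:
        g.send(x)
    try:
        g.send(None)          # averager returns 2.5; report prints it
    except StopIteration:
        pass

Under the expansion given above, the send() calls are forwarded to averager while it is running, and the StopIteration it raises carries 2.5 back into report.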

From: "Greg Ewing" <greg.ewing@canterbury.ac.nz>
Are there any use cases that warrant all this complexity? I have not yet seen a single piece of real-world code that would benefit from yield-from having pass-throughs for send/throw/close. So far, this seems to have been a purely theoretical exercise in what is possible, but it doesn't seem to have included any investigation of whether it is actually useful.

In the absence of real-world use cases, it might still be helpful to look at some contrived, hypothetical use cases so we can see whether the super-powered version actually provides a better solution (is the code more self-evidently correct, is the construct easy to use and understand, is it awkward to use)?

The proto-PEP seems heavy on specification and light on showing that this is actually something we want to have. Plenty of folks have shown an interest in a basic version of yield-every or yield-from, but prior to this proto-PEP, I've never seen any request for or discussion of a version that does pass-throughs for send/throw/close.

Raymond

Raymond Hettinger wrote:
Are there any use cases that warrant all this complexity?
Have you read the latest version of the Rationale? I've tried to explain more clearly where I'm coming from.

As for real-world use cases, I've seen at least two frameworks that people have come up with for using generators as threads, where you make calls by writing things like

    result = yield Call(f(x, y))

There is a "driver" at the top that's maintaining a stack of generators and managing all the plumbing, so that you can pretend the above statement is just doing the same as

    result = f(x, y)

except that f is suspendable.

So some people *are* actually doing this sort of thing in real life, in a rather ad-hoc way. My proposal would standardise and streamline it, and make it more efficient. It would also free up the values being passed in and out of the yields so you can use them for your own purposes, instead of using them to implement the coroutine machinery.

--
Greg
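
For readers who haven't seen such a framework, here is a stripped-down sketch of the kind of driver Greg is describing. The names Call, Return and run are made up for this example and don't come from any particular framework; it runs on current Python, without yield-from.

    class Call(object):
        # Request to the driver: run this generator as a subroutine of
        # the current lightweight thread.
        def __init__(self, gen):
            self.gen = gen

    class Return(object):
        # Request to the driver: finish the current subroutine with this
        # result (a generator can't "return x" in current Python).
        def __init__(self, value):
            self.value = value

    def run(main):
        # Minimal trampoline: keeps a stack of generators so that
        #     result = yield Call(f(x, y))
        # inside one generator behaves like calling f(x, y), with f
        # suspendable.
        stack = [main]
        value = None
        while stack:
            try:
                request = stack[-1].send(value)
            except StopIteration:
                stack.pop()
                value = None
                continue
            if isinstance(request, Call):
                stack.append(request.gen)
                value = None
            elif isinstance(request, Return):
                stack.pop()
                value = request.value
            else:
                value = request   # a plain yield; real frameworks do more here

    def double(x):
        yield Return(x * 2)

    def add_one(x):
        result = yield Call(double(x))
        yield Return(result + 1)

    def main():
        y = yield Call(add_one(10))
        print y                   # prints 21

    run(main())

The point of the proposal, as I read it, is that yield-from would let the interpreter do the stack management that run() does here, and would free the yielded/sent values for the application's own use.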

Raymond Hettinger wrote:
What he said. I'm +1 on a basic pass-through "yield from". I understand the motivation in the proto-PEP (factoring out parts of a generator into other generators), but it's not clear how genuinely useful this is in practice. I haven't used threads, and the motivating use case doesn't mean anything to me.

If I've understood the proto-PEP, it wraps four distinct pieces of functionality:

    "yield from" pass-through
    pass-through for send
    pass-through for throw
    pass-through for close

I think each one needs to be justified, or at least explained, individually. I'm afraid I'm not even clear on what pass-through for send/throw/close would even mean, let alone why they would be useful.

Basic yield pass-through is obvious, and even if we decide that it's nothing more than syntactic sugar for "for x in gen: yield x", I think it's a clear win for readability. But the rest needs some clear, simple examples of how they would be used.

--
Steven

I'd like to understand better what this function would do:

    def generate_concatenate(generator_list):
        for g in generator_list:
            yield from g

in particular, what does generate_concatenate.close() do?

--- Bruce

Steven D'Aprano wrote:
First of all, to be clear on this, the send, throw and close mechanisms were proposed in PEP 342 and adopted in Python 2.5. For some reason though, these new mechanisms didn't seem to make it into the standard Python documentation. So you'll need to read PEP 342 if you have any question on how these work.

This post is on "pass-through for close". I've tried to make these as simple as possible, but there's still a little bit to it, so please bear with me. Let's get started.

We're going to do a little loan application program. We're going to process a list of loan applications. Each loan application consists of a list of people. If any of the people on the list qualify, then they get the loan. If none of the people qualify, they don't get the loan.

We're going to have a generator that generates the individual names. If the name does not qualify, then DoesntQualify is raised by the caller using the throw method:

    class DoesntQualify(Exception): pass

    Names = [['Raymond'], ['Bruce', 'Marilyn'], ['Jack', 'Jill']]

    def gen(l):
        count = 0
        try:
            for names in l:
                count += 1
                for name in names:
                    try:
                        yield name
                        break
                    except DoesntQualify:
                        pass
                else:
                    print names, "don't qualify"
        finally:
            print "processed", count, "applications"

Now we need a function that gets passed this generator and checks each name to see if it qualifies. I would expect to be able to write:

    def process(generator):
        for name in generator:
            if len(name) > 5:
                print name, "qualifies"
            else:
                raise DoesntQualify

But running this gives:
What I expected was that the for statement in process would forward the DoesntQualify exception to the generator. But it doesn't do this, so I'm left to do it myself. My next try developing this example was:

    def process(generator):
        for name in generator:
            while True:
                if len(name) > 5:
                    print name, "qualifies"
                    break
                else:
                    name = generator.throw(DoesntQualify)

But running this gives:

    Raymond qualifies
    Marilyn qualifies
    ['Jack', 'Jill'] don't qualify
    processed 3 applications
    Traceback (most recent call last):
      File "throw2.py", line 46, in <module>
        process2(gen(Names))
      File "throw2.py", line 43, in process2
        name = iterable.throw(DoesntQualify)
    StopIteration

Oops, the final throw raised StopIteration when it hit the end of Names. So I end up with:

    def process(generator):
        try:
            for name in generator:
                while True:
                    if len(name) > 5:
                        print name, "qualifies"
                        break
                    else:
                        name = generator.throw(DoesntQualify)
        except StopIteration:
            pass

This one works:

    Raymond qualifies
    Marilyn qualifies
    ['Jack', 'Jill'] don't qualify
    processed 3 applications

But by this time, it's probably more clear if I just abandon the for statement entirely:

    def process(generator):
        name = generator.next()
        while True:
            try:
                if len(name) > 5:
                    print name, "qualifies"
                    name = generator.next()
                else:
                    name = generator.throw(DoesntQualify)
            except StopIteration:
                break

But now I need to change process to add a limit to the number of accepted applications:

    def process(generator, limit):
        name = generator.next()
        count = 1
        while count <= limit:
            try:
                if len(name) > 5:
                    print name, "qualifies"
                    name = generator.next()
                    count += 1
                else:
                    name = generator.throw(DoesntQualify)
            except StopIteration:
                break

Seems easy enough, except that this is broken again because the final "processed N applications" message won't come out if the limit is hit (unless you are running CPython and call it in such a way that the generator is immediately collected -- but this doesn't work on jython or ironpython). That's what the close method is for, and I forgot to call it:

    def process(generator, limit):
        name = generator.next()
        count = 1
        while count <= limit:
            try:
                if len(name) > 5:
                    print name, "qualifies"
                    name = generator.next()
                    count += 1
                else:
                    name = generator.throw(DoesntQualify)
            except StopIteration:
                break
        generator.close()

So what starts out conceptually simple ends up more complicated and error prone than I had expected; and the reason is that the for statement doesn't support these new generator methods. If it did, I would have:

    def process(generator, limit):
        count = 1
        for generator as name:          # new syntax doesn't break old code
            if len(name) > 5:
                print name, "qualifies"
                count += 1
                if count > limit: break
            else:
                raise DoesntQualify     # new for passes this to generator.throw
        # new for remembers to call generator.close for me.

Now, we need to extend this because there are several lists of applications. I'd like to be able to use the same gen function on each list, and the same process function, and just introduce an intermediate generator that gathers up the output of several generators. This is exactly what itertools.chain does! So this should be very easy:
But, nope, itertools.chain doesn't honor the extra generator methods either. If we had yield from, then I could use that instead of itertools.chain:

    def multi_gen(gen_list):
        for gen in gen_list:
            yield from gen

When I use yield from, it sets multi_gen aside and lets process talk directly to each generator. So I would expect that not only would objects yielded by each generator be passed directly back to process, but that exceptions passed in by process with throw would be passed directly to the generator. Why would this *not* be the case?

With the for statement, I can see that doing the throw/close processing might break some legacy code, and I understand the reservation in doing so there. But here we have a new language construct where we don't need to worry about legacy code. It's also a construct dealing directly and exclusively with generators.

If I can't use yield from, and itertools.chain doesn't work, and the for statement doesn't work, then I'm faced once again with having to code everything again myself:

    def multi_gen(gen_list):
        for gen in gen_list:
            while True:
                try:
                    yield gen.next()
                except DoesntQualify, e:
                    yield gen.throw(e)
                except StopIteration:
                    gen.close()

Yuck! Did I get this one right? Nope, same StopIteration problem with gen.throw... Let's try:

    def multi_gen(gen_list):
        for gen in gen_list:
            try:
                while True:
                    try:
                        yield gen.next()
                    except DoesntQualify, e:
                        yield gen.throw(e)
            except StopIteration:
                pass
            finally:
                gen.close()

Even more yuck! This feels more like programming in assembler than python :-(

-bruce frederiksen
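
For completeness, here is roughly what a hand-written multi_gen has to look like to get the full pass-through behaviour, adapted from the expansion in the proto-PEP (my adaptation, not Bruce's code, and a sketch rather than a definitive version); this boilerplate is exactly what "yield from gen" is meant to absorb:

    def multi_gen(gen_list):
        # Delegate to each gen in turn, forwarding sent values and
        # thrown-in exceptions, and finalising each gen when done.
        for gen in gen_list:
            try:
                value = gen.next()
                while True:
                    try:
                        sent = yield value
                    except Exception, e:
                        if hasattr(gen, 'throw'):
                            value = gen.throw(e)   # re-yield whatever throw produces
                        else:
                            raise
                    else:
                        if hasattr(gen, 'send'):
                            value = gen.send(sent)
                        else:
                            value = gen.next()
            except StopIteration:
                pass                               # this gen is exhausted; move on
            finally:
                if hasattr(gen, 'close'):
                    gen.close()

With this version, Bruce's process() can throw DoesntQualify into multi_gen and have it land in whichever gen is currently active, and closing multi_gen finalises the active gen first.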

On Sun, Feb 15, 2009 at 2:41 PM, Bruce Frederiksen <dangyogi@gmail.com> wrote:
I plan to do several posts, one on each item above, in an attempt to demonstrate what we're talking about here.
Thanks for the examples, they gave a good idea of what we're *really* talking about :)
Backwards compatibility is not the (only) issue here. Implicitly calling the extra generator methods is optional at best and non-intuitive at worst. For close(), it's usually desirable for it to be called when a loop exits naturally, although that's debatable for prematurely ended loops; the caller may still have a use for the non-exhausted generator. For throw() however, I strongly disagree that a raise statement in a loop should implicitly call generator.throw(), regardless of what "for" syntax is used. When I read "raise Exception", I expect control to flow out of the current frame to the caller, not into an unrelated frame of some generator. The only viable option would perhaps be a new statement, say "throw Exception", that distinguishes it clearly from raise.
As I said, I don't think that the for statement can or should be made to "work", but would updating chain(), or all of itertools.* for that matter, so that they play well with the new methods solve most real-world cases? If so, that's probably better than adding new syntax, practicality-beats-purity and all that.

George
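
As a rough sketch of the less controversial half of that suggestion, a chain() variant that at least finalises the generator it is currently draining could be written today; the name closing_chain is made up for this example, and full send/throw forwarding would still need the longhand expansion from the proto-PEP:

    def closing_chain(*iterables):
        # Like itertools.chain(), but if the chain itself is closed (or an
        # exception is thrown into it), the iterable currently being
        # drained gets close()d before the exception propagates.
        for it in iterables:
            try:
                for value in it:
                    yield value
            finally:
                close = getattr(it, 'close', None)
                if close is not None:
                    close()

    # e.g., with Bruce's gen() and Names:
    #   for name in closing_chain(gen(Names[:1]), gen(Names[1:])): ...
    # Closing the chain early still prints "processed N applications".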

George Sakkis wrote:
Just in case it's not clear, the behaviour being suggested here is *not* part of my proposal. As far as yield-from is concerned, propagation of exceptions into the subgenerator would only occur when throw() was called on the generator containing the yield-from, and then only when it's suspended in the midst of it. Raise statements within the delegating generator have nothing to do with the matter and aren't affected at all. Having some examples to look at is a good idea, but Bruce seems to be going off on a tangent and making some proposals of his own for enhancing the for-loop. I fear that this will only confuse the discussion further. Perhaps I should also point out that yield-from is *not* intended to help things like itertools.chain manage the cleanup of its generators, so examples involving things with chain-like behaviour are probably not going to help clarify what it *is* intended for. It would be nice to have a language feature to help with things like that, but I have no idea at the moment what such a thing would be like. -- Greg

Greg Ewing wrote:
Guilty! I apologize for any side-tracking of the yield from discussion. As people are asking for real-world examples, and I've done a lot with generators, and I didn't see many other people offering examples, I thought I could offer some. But my code obviously doesn't use yield from, so I'm looking at uses of the for statement or itertools.chain, which are the two things that would be replaced by yield from. So I'm thinking, on the one hand, that examples where for or chain should forward send/throw/close should transfer to yield from. But I'm also thinking that the same arguments apply to for/chain.

OTOH, the desire to use "yield from" for a "poor man's" cooperative threading facility also brings me to think that generators have 3 fatal design flaws that will prevent them from growing into something much more useful (like threading):

1. The double use of send/throw and the yield expression for simultaneous input and output to/from the generator, rather than separating input and output as two different constructs. Sending one value in does not always correspond to getting one value out.

2. The absence of an object (even an implicit one, like sys.stdin and sys.stdout are for input and print) representing the target of the yield/throw/send that can be passed on to other functions, allowing them to contribute to the generator's output stream in a much more natural way.

   * I'm thinking here of a pair of cooperating pipe objects, read and write, and a pair of built-in functions, something like input and print, that get and send an object to implicit pipein and pipeout objects (one for each "thread"). These would replace send and yield.

   * But I think that the iterator interface is very successful, should be kept intact, and is what the read pipe object should look like.

3. The double use of yield to indicate rendezvoused output to the parent "thread", as well as to flag its containing function as one that always starts a new "thread" when executed.

   * This prevents us from having generator A simply call generator B to have B yield objects for A. In other words, calling B as a normal function that doesn't start another thread would mean that B yields to the current thread's pipeout, while starting B in a new thread with its own pipeout would do what current generators do. Thus generator A would have the option to run B in two ways: as a new generator thread to yield values back to A, or within A's thread as a normal function to yield values to the same place that A yields values to.

   * I'm thinking that there would be a builtin generate function (or some special syntax) used to run a function in a new thread. Thus generate(gen_b, arg1, arg2, ...) would return a read pipe (which is an iterable) connected to the write pipe for the new thread:

         for x in generate(gen_b, arg1, arg2, ...):

     or maybe:

         for x in gen_b(arg1, arg2, ...)&:

     or whatever, is different than:

         gen_b(arg1, arg2, ...)

     This would accomplish what yield from is trying to do in a more flexible and readable way.

So the question in my mind is: do we move towards adopting some new kind of generator/threading capability (and eventually deprecating current generators) that doesn't have these limitations, or do we stick with generators? If we want to stick with the current generators, then I'm in favor of the proposed "yield from" (with the possible exception of the new "return"). But even if we want to move towards a new-style generator capability, "yield from" could be fielded much more quickly than a whole new-style generator capability, so ???
If people are interested in discussing this further, I'm open to that. Otherwise, sorry for the side-tracking... -bruce frederiksen

Bruce Frederiksen wrote:
You might not be interested in sending or receiving a value every time, but you do have to suspend the generator each time you want to send and/or receive a value. Currently, there is only one way to suspend a generator, which for historical reasons is called 'yield'. Each time you use it, you have the opportunity to send a value, and an opportunity to receive a value, but you don't have to use both of these (or either of them) if you don't want to. What you seem to be proposing is having two aliases for 'yield', one of which only sends and the other only receives. Is that right? If so, I don't see much point in it other than making code read slightly better.
* I'm thinking here of a pair of cooperating pipe objects, read and write,
Pipes are different in an important way -- they have queueing. Writes to one end don't have to interleave perfectly with reads at the other. But generators aren't like that -- there is no buffer to hold sent/yielded values until the other end is ready for them. Or are you suggesting that there should be such buffering? I would say that's a higher-level facility that should be provided by library code using yield, or something like it, as a primitive. -- Greg

Greg Ewing wrote:

This would be replaced by builtin functions. I would propose that the builtins take optional pipe arguments that would default to the current thread's pipein/pipeout. I would also propose that each thread be allowed multiple input and/or output pipes, and that the selection of which to use could be done by passing an integer value for the pipe argument. For example:

    send(obj, pipeout = None)
    send_from(iterable, pipeout = None)   # does what "yield from" is supposed to do
    next(iterator = None)
    num_input_pipes()
    num_output_pipes()

You may need a few more functions to round this out:

    pipein(index = 0)    # returns the current thread's pipein[index] object,
                         # could also use iter() for this
    pipeout(index = 0)   # returns the current thread's pipeout[index] object
    throwforward(exc_type, exc_value = None, traceback = None, pipeout = None)
    throwback(exc_type, exc_value = None, traceback = None, pipein = None)

Thus:

    yield expr

becomes:

    send(expr)

which doesn't mean "this is a generator" or that control will *necessarily* be transferred to another thread here. It depends on whether the other thread has already done a next on the corresponding pipein.

I'm thinking that the C code (the bytecode interpreter) that manages Python stack frame objects becomes detached from the Python stack, so that a Python-to-Python call does not grow the C stack. This would allow the C code to fork the Python stack and switch between branches quite easily.

This separation of input and output would clean up most generator examples. Guido's tree flattener has special code to yield SKIP in response to SKIP, because he doesn't really want a value returned from sending a SKIP in. This would no longer be necessary.

    def __iter__(self):
        skip = yield self.label
        if skip == SKIP:
            yield SKIPPED
        else:
            skip = yield ENTER
            if skip == SKIP:
                yield SKIPPED
            else:
                for child in self.children:
                    yield from child
                yield LEAVE    # I guess a SKIP can't be returned here?

becomes:

    def __iter__(self):
        return generate(self.flatten)

    def flatten(self):
        send(self.label)
        if next() != SKIP:
            send(ENTER)
            if next() != SKIP:
                for child in self.children:
                    child.flatten()
            send(LEAVE)

Also, the caller could then simply look like:

    for token in tree():
        if too_deep:
            send(SKIP)
        else:
            send(None)
        <process token>

rather than:

    response = None
    gen = tree()
    try:
        while True:
            token = gen.send(response)
            if too_deep:
                response = SKIP
            else:
                response = None
            <process token>
    except StopIteration:
        pass

The reason for this extra complexity is that send returns a value. Separating send from yielding values lets you call send from within for statements without having another value land in your lap that you really would rather have sent to the for statement. The same thing applies to throw. If throw didn't return a value, then it could easily be called within for statements.
The parsing example goes from:

    def scanner(text):
        for m in pat.finditer(text):
            token = m.group(0)
            print "Feeding:", repr(token)
            yield token
        yield None    # to signal EOF

    def parse_items(closing_tag = None):
        elems = []
        while 1:
            token = token_stream.next()
            if not token:
                break    # EOF
            if is_opening_tag(token):
                elems.append(parse_elem(token))
            elif token == closing_tag:
                break
            else:
                elems.append(token)
        return elems

    def parse_elem(opening_tag):
        name = opening_tag[1:-1]
        closing_tag = "</%s>" % name
        items = parse_items(closing_tag)
        return (name, items)

to:

    def scanner(text):
        for m in pat.finditer(text):
            token = m.group(0)
            print "Feeding:", repr(token)
            send(token)

    def parse_items(closing_tag = None):
        for token in next():
            if is_opening_tag(token):
                send(parse_elem(token))
            elif token == closing_tag:
                break
            else:
                send(token)

    def parse_elem(opening_tag):
        name = opening_tag[1:-1]
        closing_tag = "</%s>" % name
        items = list(generate(parse_items(closing_tag), pipein=pipein()))
        return (name, items)

and perhaps called as:

    tree = list(scanner(text) | parse_items())

This also obviates the need to do an initial next call when pushing (sending) to generators which are acting as consumers -- a need which is difficult to explain and to understand.
I didn't mean to imply that buffering was required, or even desired. With no buffering, the sender and receiver stay in sync, just like generators. A write would suspend until a matching read, and vice versa. Only when the pipe sees both a write and a read would the object be transferred from the writer to the reader. Thus, write/read replaces yield as the way to suspend the current "thread".

This avoids the confusion about whether we're "pushing" or "pulling" to/from a generator. For example, itertools.tee is currently designed as a generator that "pulls" values from its iterable parameter. But then it can't switch roles to "push" values to its consumers, and so must be prepared to store values in case the consumers aren't synchronized with each other. With this new approach, the consumer waiting for the sent value would be activated by the pipe connecting it to tee. And if that consumer wasn't ready for a value yet, tee would be suspended until it was. So tee would not have to store any values:

    def tee():
        num_outputs = num_output_pipes()
        for input in next():
            for i in range(num_outputs):
                send(input, i)

Does this help?

-bruce frederiksen

Bruce Frederiksen wrote:
All this is higher-level stuff that can be built on the primitive operation of yielding. For instance it could easily be added to the scheduling library I'm about to post (I tried to post it yesterday, but it bounced). -- Greg

Steven D'Aprano wrote:
I want to write a function that plays "guess this number" by making successive guesses and getting a high/low response. My first version will generate random guesses:

    def rand_guesser(limit):
        lo = 0              # answer is > lo
        hi = limit + 1      # answer is < hi
        num_tries = 0
        while lo + 2 < hi:
            guess = random.randint(lo + 1, hi - 1)
            num_tries += 1
            result = yield guess
            if result == 0:
                break
            if result < 0:
                lo = guess
            else:
                hi = guess
        else:
            guess = lo + 1
        print "rand_guesser: got", guess, "in", num_tries, "tries"

and then the function that calls it:

    def test(guesser, limit):
        n = random.randint(1, limit)
        print "the secret number is", n
        try:
            guess = guesser.next()
            while True:
                print "got", guess
                guess = guesser.send(cmp(guess, n))
        except StopIteration:
            pass
        # guesser.close() isn't necessary if we got StopIteration,
        # because the generator has already finalized.
So far, so good. But how does binary_search compare with rand_guesser?

    def binary_search(limit):
        lo = 0
        hi = limit + 1
        num_tries = 0
        while lo + 2 < hi:
            guess = (hi + lo) // 2
            num_tries += 1
            result = yield guess
            if result == 0:
                break
            if result < 0:
                lo = guess
            else:
                hi = guess
        else:
            guess = lo + 1
        print "binary_search: got", guess, "in", num_tries, "tries"
Hmmm, but to compare these, I need to run them on the same answer number. I know, I can just chain them together -- then test will just see both sets of guesses back to back... Another obvious choice for itertools.chain!
Oops, that's right, itertools.chain doesn't play nicely with advanced generators... :-( So I guess I have to write my own intermediate multi_guesser... Luckily, we have yield from!

    def multi_guesser(l, limit):
        for gen in l:
            yield from gen(limit)

What does yield from do? It sets multi_guesser aside so that test can communicate directly with each gen. Objects yielded by the gen go directly back to test. And I would expect that objects sent from test (with send) would go directly to the gen. If that's the case, this works fine! If not, then I'm sad again and have to do something like:

    def multi_guesser(l, limit):
        for gen in l:
            g = gen(limit)
            try:
                guess = g.next()
                while True:
                    guess = g.send((yield guess))
            except StopIteration:
                pass

Which one do you think is more pythonic? Which one would you rather get stuck maintaining? (Personally, I'd vote for itertools.chain!)
-bruce frederiksen

On Fri, Feb 13, 2009 at 5:31 PM, Raymond Hettinger <python@rcn.com> wrote:
While I haven't read the PEP thoroughly, I believe I understand the concept of pass-through and I think I have a compelling use case, at least for passing through .send(). The rest then shouldn't be a problem. Let's also not forget that 99% of all uses of generators don't involve .send(), .throw() or .close().

My use case is flattening of trees, for example parse trees. For concreteness, assume a node has a label and a list of children. The iteration should receive ENTER and LEAVE pseudo-labels when entering a level. We can then write a pre-order iterator like this, using yield-from without caring about pass-through:

    def __iter__(self):
        yield self.label
        if self.children:
            yield ENTER
            for child in self.children:
                yield from child
            yield LEAVE

Now suppose the caller of the iteration wants to be able to occasionally truncate the traversal, e.g. it may not be interested in the subtree for certain labels, or it may want to skip very deep trees. It's not possible to anticipate what the caller wants to truncate, so we don't want to build direct support for e.g. skip-lists or level-control into the iterator. Instead, the caller now uses .send(SKIP) when it wants to skip a subtree. The iterator responds with a SKIPPED pseudo-label. For example:

    def __iter__(self):
        skip = yield self.label
        if skip == SKIP:
            yield SKIPPED
        else:
            skip = yield ENTER
            if skip == SKIP:
                yield SKIPPED
            else:
                for child in self.children:
                    yield from child
                yield LEAVE

I believe the pass-through semantics proposed for yield-from are *exactly* what we need in this case. Without it, the for-loop would have to be written like this:

    for child in self.children:
        it = iter(child)
        while True:
            try:
                value = it.send(skip)
            except StopIteration:
                break
            skip = yield value

Other remarks:

(a) I don't know if the PEP proposes that "yield from expr" should return the last value returned by (i.e. sent to) a yield somewhere deeply nested; I think this would be useful.

(b) I hope the PEP also explains what to do if "expr" is not a generator but some other kind of iterator. IMO it should work as long as .send() etc. are not used. I think it would probably be safest to raise an exception if .send() is used and the receiving iterator is not a generator. For .throw() and .close() it would probably be most useful to let them have their effect in the last generator on the stack.

(c) A quick skim of the PEP didn't show suggestions for how to implement this. I think this needs to be addressed. I don't think it will be possible to literally replace the outer generator with the inner one while that is running; the treatment of StopIteration probably requires some kind of chaining, so that there is still a cost associated with deeply nested yield-from clauses. However it could be much more efficient than explicit for-loops.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)
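
To make the example concrete on current Python (no yield-from yet), here is a hypothetical Node class plus a caller that skips one subtree, with the delegation written out by hand along the lines of the loop Guido shows; the class, the pseudo-label values and the driver are illustrative additions, not Guido's code.

    ENTER, LEAVE, SKIP, SKIPPED = 'ENTER', 'LEAVE', 'SKIP', 'SKIPPED'

    class Node(object):
        def __init__(self, label, children=()):
            self.label = label
            self.children = list(children)

        def walk(self):
            # Guido's skip-aware traversal, with the child delegation
            # spelled out manually instead of "yield from child".
            skip = yield self.label
            if skip == SKIP:
                yield SKIPPED
            else:
                skip = yield ENTER
                if skip == SKIP:
                    yield SKIPPED
                else:
                    for child in self.children:
                        sub = child.walk()
                        value = sub.next()
                        while True:
                            try:
                                skip = yield value
                                value = sub.send(skip)
                            except StopIteration:
                                break
                    yield LEAVE

    tree = Node('a', [Node('big', [Node('x'), Node('y')]), Node('c')])

    it = tree.walk()
    token = it.next()
    try:
        while True:
            print token
            # Skip the subtree rooted at the node labelled 'big'.
            token = it.send(SKIP if token == 'big' else None)
    except StopIteration:
        pass
    # prints: a ENTER big SKIPPED c ENTER LEAVE LEAVE  (x and y are skipped)

The manual send-forwarding loop inside walk() is the part that a pass-through yield-from would replace with a single line.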

Guido van Rossum wrote:
No, it doesn't. Its value is the value passed to the 'return' statement that terminates the subgenerator (and generators are enhanced to permit return with a value). My reason for doing this is so you can use subgenerators like functions in a generator that's being used as a lightweight thread.
(b) I hope the PEP also explains what to do if "expr" is not a generator but some other kind of iterator.
Yes, it does. Currently I'm proposing that, if the relevant methods are not defined, send() is treated like next(), and throw() and close() do what they would have done normally on the parent generator.
(c) A quick skim of the PEP didn't show suggestions for how to implement this.
One way would be to simply emit the bytecode corresponding to the presented expansion, although that wouldn't be very efficient in terms of either speed or code size. The PEP also sketches an optimised implementation in which the generator has a slot which refers to the generator being delegated to. Calls to next(), send(), throw() and close() are forwarded via this slot if it is nonempty. There will still be a small overhead involved in the delegation, but it's only a chain of C function calls instead of Python ones, which ought to be a big improvement. It might be possible to reduce the overhead even further by following the chain of delegation pointers in a loop until reaching the end and then calling the end generator directly. It would be trickier to get right, though, because you'd have to be prepared to back up and try earlier generators in the face of StopIterations. -- Greg
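
A toy Python-level model of that delegation-slot idea (illustrative only -- the real thing would live in the C generator object, and the Delegate marker here merely stands in for the effect of a yield-from expression):

    class Delegate(object):
        # Marker yielded by a generator's code meaning "delegate to this
        # subgenerator until it finishes" (stand-in for "yield from").
        def __init__(self, sub):
            self.sub = sub

    class DelegatingGen(object):
        # Toy model of the proposed behaviour: a slot holding the generator
        # currently being delegated to, checked on every send().
        def __init__(self, gen):
            self.gen = gen        # the delegating generator's own frame
            self.slot = None      # generator being delegated to, if any

        def send(self, value):
            while True:
                if self.slot is not None:
                    try:
                        return self.slot.send(value)       # fast path
                    except StopIteration, e:
                        self.slot = None                   # subgenerator done
                        value = e.args[0] if e.args else None
                result = self.gen.send(value)
                if isinstance(result, Delegate):
                    self.slot = result.sub
                    value = None                           # prime the subgenerator
                    continue
                return result

        def next(self):
            return self.send(None)

    def sub():
        yield 1
        yield 2

    def outer():
        yield 0
        yield Delegate(sub())
        yield 3

    g = DelegatingGen(outer())
    try:
        while True:
            print g.next()          # prints 0, 1, 2, 3
    except StopIteration:
        pass

While the slot is set, calls go straight to the subgenerator without re-entering the delegating generator's frame, which is the effect the optimisation aims for at the C level.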

[Greg Ewing]
Looks like a language construct where only a handful of Python programmers will be able to correctly describe what it does. I've only seen requests for the functionality in the first bullet point. The rest seems like unnecessary complexity -- something that will take a page in the docs rather than a couple of lines.
This seems like it is awkwardly trying to cater to two competing needs. It recognizes that the outer generator may have a legitimate need to catch an exception, and that the inner generator might want it too. Unfortunately, only one can be caught and there is no way to have both the inner and outer generator/iterator each do their part in servicing an exception. Also, I am concerned that this slows down the more common case of _i not having a throw method. The same thoughts apply to send() and close(). Potentially, both an inner and outer generator will need a close function, but there is no way for both to run theirs.

Raymond

On Mon, Feb 16, 2009 at 6:51 PM, Raymond Hettinger <python@rcn.com> wrote:
That doesn't necessarily matter. It's true for quite a few Python constructs that many Python programmers use without knowing every little semantic detail. If you don't use .send(), .throw(), .close(), the semantics of "yield from" are very simple to explain and remember. All the rest is there to make advanced uses possible. -- --Guido van Rossum (home page: http://www.python.org/~guido/)

Raymond Hettinger wrote:
The whole area of generators is one where I think only a minority of programmers will ever fully understand all the gory details. Those paragraphs are there for the purpose of providing a complete and rigorous specification of what's being proposed. For most people, almost all the essential information is summed up in this one sentence near the top:

    The effect is to run the iterator to exhaustion, during which time
    it behaves as though it were communicating directly with the caller
    of the generator containing the ``yield from`` expression.

As I've said, I believe it's actually quite simple conceptually. It's just that it gets messy trying to explain it by way of expansion into currently existing Python code and concepts.
I don't understand what you mean by that. If you were making an ordinary function call, you'd expect that the called function would get first try at catching any exception occurring while it's running, and if it doesn't, it propagates out to the calling function. Also it's not true that only one of them can catch the exception. The inner one might catch it, do some processing and then re-raise it. Or it might do something in a finally block. My intent is for all these things to work the same way when one generator delegates to another using yield-from. -- Greg
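
A small sketch of the behaviour Greg describes, hypothetical because it uses the proposed syntax: the inner generator gets first crack at the thrown-in exception, runs its finally block, and the re-raised exception is then visible to the outer generator as well.

    class Shutdown(Exception): pass

    def inner():
        try:
            while True:
                yield
        except Shutdown:
            print "inner: flushing buffers"
            raise                      # let the outer generator see it too
        finally:
            print "inner: closing file"

    def outer():
        try:
            yield from inner()
        except Shutdown:
            print "outer: logging shutdown"

    g = outer()
    g.next()
    g.throw(Shutdown)    # under the proposal: both inner messages print,
                         # then the outer one, then StopIteration is
                         # raised to this caller

So both generators service the exception, just as a function and its caller would if the exception propagated out of an ordinary call.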

From: "Greg Ewing" <greg.ewing@canterbury.ac.nz>
Are there any use cases that warrant all this complexity? I not yet seen a single piece of real-world code that would benefit from yield-from having pass-throughs for send/throw/close. So far, this seems to have been a purely theoretical exercise what is possible, but it doesn't seem to include investigation as to whether it is actually useful. In the absence of real-world use cases, it might still be helpful to look at some contrived, hypothetical use cases so we can see if the super-powered version actually provides a better solution (is the code more self-evidently correct, is the construct easy to use and understand, is it awkward to use)? The proto-pep seems heavy on specification and light on showing that this is actually something we want to have. Plenty of folks have shown an interest in a basic version of yield-every or yield-from, but prior to this protoPEP, I've never seen any request for or discussion of a version that does pass-throughs for send/throw/close. Raymond

Raymond Hettinger wrote:
Are there any use cases that warrant all this complexity?
Have you read the latest version of the Rationale? I've tried to explain more clearly where I'm coming from. As for real-world use cases, I've seen at least two frameworks that people have come up with for using generators as threads, where you make calls by writing things like result = yield Call(f(x, y)) There is a "driver" at the top that's maintaining a stack of generators and managing all the plumbing, so that you can pretend the above statement is just doing the same as result = f(x, y) except that f is suspendable. So some people *are* actually doing this sort of thing in real life, in a rather ad-hoc way. My proposal would standardise and streamline it, and make it more efficient. It would also free up the values being passed in and out of the yields so you can use them for your own purposes, instead of using them to implement the coroutine machinery. -- Greg

Raymond Hettinger wrote:
What he said. I'm +1 on a basic pass-through "yield from". I understand the motivation in the protoPEP (factoring out parts of a generator into other generators), but it's not clear how genuinely useful this is in practice. I haven't used threads, and the motivating use case doesn't mean anything to me. If I've understood the protoPEP, it wraps four distinct pieces of functionality: "yield from" pass-through pass-through for send pass-through for throw pass-through for close I think each one needs to be justified, or at least explained, individually. I'm afraid I'm not even clear on what pass-through for send/throw/close would even mean, let alone why they would be useful. Basic yield pass-through is obvious, and even if we decide that it's nothing more than syntactic sugar for "for x in gen: yield x", I think it's a clear win for readability. But the rest needs some clear, simple examples of how they would be used. -- Steven

I'd like to understand better what this function would do: def generate_concatenate(generator_list): for g in generator_list: yield from g in particular, what does generator_concatenate.close() do? --- Bruce

Steven D'Aprano wrote:
First of all, to be clear on this, the send, throw and close mechanisms were proposed in PEP 342 and adopted in Python 2.5. For some reason though, these new mechanisms didn't seem to make it into the standard Python documentation. So you'll need to read PEP 342 if you have any question on how these work. This post is on "pass-through for close". I've tried to make these as simple as possible, but there's still a little bit to it, so please bear with me. Let's get started. We're going to do a little loan application program. We're going to process a list of loan applications. Each loan application consists of a list of people. If any of the people on the list qualify, then they get the loan. If none of the people qualify, they don't get the loan. We're going to have a generator that generates the individual names. If the name does not qualify, then DoesntQualify is raised by the caller using the throw method: class DoesntQualify(Exception): pass Names = [['Raymond'], ['Bruce', 'Marilyn'], ['Jack', 'Jill']] def gen(l): count = 0 try: for names in l: count += 1 for name in names: try: yield name break except DoesntQualify: pass else: print names, "don't qualify" finally: print "processed", count, "applications" Now we need a function that gets passed this generator and checks each name to see if it qualifies. I would expect to be able to write: def process(generator): for name in generator: if len(name) > 5: print name, "qualifies" else: raise DoesntQualify But running this gives:
What I expected was the for statement in process would forward the DoesntQualify exception to the generator. But it doesn't do this, so I'm left to do it myself. My next try developing this example, was: def process(generator): for name in generator: while True: if len(name) > 5: print name, "qualifies" break else: name = generator.throw(DoesntQualify) But running this gives: Raymond qualifies Marilyn qualifies ['Jack', 'Jill'] don't qualify processed 3 applications Traceback (most recent call last): File "throw2.py", line 46, in <module> process2(gen(Names)) File "throw2.py", line 43, in process2 name = iterable.throw(DoesntQualify) StopIteration Oops, the final throw raised StopIteration when it hit the end of Names. So I end up with: def process(generator): try: for name in generator: while True: if len(name) > 5: print name, "qualifies" break else: name = generator.throw(DoesntQualify) except StopIteration: pass This one works: Raymond qualifies Marilyn qualifies ['Jack', 'Jill'] don't qualify processed 3 applications But by this time, it's probably more clear if I just abandon the for statement entirely: def process(generator): name = generator.next() while True: try: if len(name) > 5: print name, "qualifies" name = generator.next() else: name = generator.throw(DoesntQualify) except StopIteration: break But now I need to change process to add a limit to the number of accepted applications: def process(generator, limit): name = generator.next() count = 1 while count <= limit: try: if len(name) > 5: print name, "qualifies" name = generator.next() count += 1 else: name = generator.throw(DoesntQualify) except StopIteration: break Seems easy enough, except that this is broken again because the final "processed N applications" message won't come out if the limit is hit (unless you are running CPython and call it in such a way that the generator is immediately collected -- but this doesn't work on jython or ironpython). That's what the close method is for, and I forgot to call it: def process(generator, limit): name = generator.next() count = 1 while count <= limit: try: if len(name) > 5: print name, "qualifies" name = generator.next() count += 1 else: name = generator.throw(DoesntQualify) except StopIteration: break generator.close() So what starts out conceptually simple, ends up more complicated and error prone that I had expected; and the reason is that the for statement doesn't support these new generators methods. If it did, I would have: def process(generator, limit): count = 1 for generator as name: # new syntax doesn't break old code if len(name) > 5: print name, "qualifies" count += 1 if count > limit: break else: raise DoesntQualify # new for passes this to generator.throw # new for remembers to call generator.close for me. Now, we need to extend this because there are several lists of applications. I'd like to be able to use the same gen function on each list, and the same process function and just introduce an intermediate generator that gathers up the output of several generators. This is exactly what itertools.chain does! So this should be very easy:
But, nope, itertools.chain doesn't honor the extra generator methods either. If we had yield from, then I could use that instead of itertools.chain: def multi_gen(gen_list): for gen in gen_list: yield from gen When I use yield from, it sets multi_gen aside and lets process talk directly to each generator. So I would expect that not only would objects yielded by each generator be passed directly back to process, but that exceptions passed in by process with throw would be passed directly to the generator. Why would this *not* be the case? With the for statement, I can see that doing the throw/close processing might break some legacy code and understand the reservation in doing so there. But here we have a new language construct where we don't need to worry about legacy code. It's also a construct dealing directly and exclusively with generators. If I can't use yield from, and itertools.chain does work, and the for statement doesn't work, then I'm faced once again with having to code everything again myself: def multi_gen(gen_list): for gen in gen_list: while True: try: yield gen.next() except DoesntQualify, e: yield gen.throw(e) except StopIteration: gen.close() Yuck! Did I get this one right? Nope, same StopIteration problem with gen.throw... Let's try: def multi_gen(gen_list): for gen in gen_list: try: while True: try: yield gen.next() except DoesntQualify, e: yield gen.throw(e) except StopIteration: pass finally: gen.close() Even more yuck! This feels more like programming in assembler than python :-( -bruce frederiksen

On Sun, Feb 15, 2009 at 2:41 PM, Bruce Frederiksen <dangyogi@gmail.com> wrote:
I to do several posts, one on each item above in an attempt to demonstrate what we're talking about here.
Thanks for the examples, they gave some good idea of what we're *really* talking about :)
Backwards compatibility is not the (only) issue here. Calling implicitly the extra generator methods is optional at best and non-intuitive at worse. For close() it's usually desirable to be called when a loop exits naturally, although that's debatable for prematurely ended loops; the caller may still have a use for the non-exhausted generator. For throw() however, I strongly disagree that a raise statement in a loop should implicitly call generator.throw(), regardless of what "for" syntax is used. When I read "raise Exception", I expect the control to flow out of the current frame to the caller, not in an unrelated frame of some generator. The only viable option would perhaps be a new statement, say "throw Exception", that distinguishes it clearly from raise.
As I said, I don't think that the for statement can or should be made to "work", but would updating chain(), or all itertools.* for that matter, so that they play well with the new methods solve most real world cases ? If so, that's probably better than adding new syntax, practicality-beats-purity and all that. George

George Sakkis wrote:
Just in case it's not clear, the behaviour being suggested here is *not* part of my proposal. As far as yield-from is concerned, propagation of exceptions into the subgenerator would only occur when throw() was called on the generator containing the yield-from, and then only when it's suspended in the midst of it. Raise statements within the delegating generator have nothing to do with the matter and aren't affected at all. Having some examples to look at is a good idea, but Bruce seems to be going off on a tangent and making some proposals of his own for enhancing the for-loop. I fear that this will only confuse the discussion further. Perhaps I should also point out that yield-from is *not* intended to help things like itertools.chain manage the cleanup of its generators, so examples involving things with chain-like behaviour are probably not going to help clarify what it *is* intended for. It would be nice to have a language feature to help with things like that, but I have no idea at the moment what such a thing would be like. -- Greg

Greg Ewing wrote:
Guilty! I apologize for any side-tracking of the yield from discussion. As people are asking for real world examples and I've done a lot with generators, and I didn't see many other people offering examples, I thought I could offer some. But my code obviously doesn't use yield from, so I'm looking to use of the for statement or itertools.chain, which are the two that would be replaced by yield from. So I'm thinking, on the one hand, that examples where for or chain should forward send/throw/close should transfer to yield from. But I'm also thinking that the same arguments apply to for/chain. OTOH, the desire to use "yield from" for a "poor man's" cooperative threading facility also brings me to think that generators have 3 fatal design flaws that will prevent them from growing into something much more useful (like threading): 1. The double use of send/throw and the yield expression for simultaneous input and output to/from the generator; rather than separating input and output as two different constructs. Sending one value in does not always correspond to getting one value out. 2. The absence of an object (even an implicit one like sys.stdin and sys.stdout are for input and print) representing the target of the yield/throw/send that can be passed on to other functions, allowing them to contribute to the generator's output stream in a much more natural way. * I'm thinking here of a pair of cooperating pipe objects, read and write, and a pair of built-in functions, something like input and print that get and send an object to implicit pipein and pipeout objects (one for each "thread"). These would replace send and yield. * But I think that the iterator interface is very successful, should be kept intact, and is what the read pipe object should look like. 3. The double use of yield to indicate rendezvoused output to the parent "thread", as well as to flag its containing function as one that always starts a new "thread" when executed. * This prevents us from having generator A simply call generator B to have B yield objects for A. In other words, calling B as a normal function that doesn't start another thread would mean that B yields to the current thread's pipeout. While starting B in a new thread with its own pipeout would do what current generators do. Thus generator A would have the option to run B in two ways, as a new generator thread to yield values back to A, or within A's thread as a normal function to yield values to the same place that A yields values to. * I'm thinking that there would be a builtin generate function (or some special syntax) used to run a function in a new thread. Thus generate(gen_b, arg1, arg2, ...) would return a read pipe (which is an iterable) connected to the write pipe for the new thread: for x in generate(gen_b, arg1, arg2, ...): or maybe: for x in gen_b(arg1, arg2, ...)&: or whatever, is different than: gen_b(arg1, arg2, ...) This would accomplish what yield from is trying to do in a more flexible and readable way. So the question in my mind is: do we move towards adopting some new kind of generator/threading capability (and eventually deprecating current generators) that doesn't have these limitations, or do we stick with generators? If we want to stick with the current generators, then I'm in favor of the proposed "yield from" (with the possible exception of the new "return"). But even if we want to more towards a new-style generator capability, "yield from" could be fielded much more quickly than a whole new-style generator capability, so ??? 
If people are interested in discussing this further, I'm open to that. Otherwise, sorry for the side-tracking... -bruce frederiksen

Bruce Frederiksen wrote:
You might not be interested in sending or receiving a value every time, but you do have to suspend the generator each time you want to send and/or receive a value. Currently, there is only one way to suspend a generator, which for historical reasons is called 'yield'. Each time you use it, you have the opportunity to send a value, and an opportunity to receive a value, but you don't have to use both of these (or either of them) if you don't want to. What you seem to be proposing is having two aliases for 'yield', one of which only sends and the other only receives. Is that right? If so, I don't see much point in it other than making code read slightly better.
* I'm thinking here of a pair of cooperating pipe objects, read and write,
Pipes are different in an important way -- they have queueing. Writes to one end don't have to interleave perfectly with reads at the other. But generators aren't like that -- there is no buffer to hold sent/yielded values until the other end is ready for them. Or are you suggesting that there should be such buffering? I would say that's a higher-level facility that should be provided by library code using yield, or something like it, as a primitive. -- Greg
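(A tiny illustration, not from the thread, of the no-buffering point: nothing is produced until the consumer asks, so there is never a value waiting in between the two ends:)

    def producer():
        for i in range(3):
            print "producing", i
            yield i

    p = producer()
    print "generator created; nothing has been produced yet"
    print "got", next(p)   # "producing 0" appears only now
    print "got", next(p)   # and "producing 1" only now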

Greg Ewing wrote: form).

This would be replaced by builtin functions. I would propose that the builtins take optional pipe arguments that would default to the current thread's pipein/pipeout. I would also propose that each thread be allowed multiple input and/or output pipes, and that the selection of which to use could be done by passing an integer value for the pipe argument. For example:

    send(obj, pipeout = None)
    send_from(iterable, pipeout = None)   # does what "yield from" is supposed to do
    next(iterator = None)
    num_input_pipes()
    num_output_pipes()

You may need a few more functions to round this out:

    pipein(index = 0)    # returns the current thread's pipein[index] object;
                         # could also use iter() for this
    pipeout(index = 0)   # returns the current thread's pipeout[index] object
    throwforward(exc_type, exc_value = None, traceback = None, pipeout = None)
    throwback(exc_type, exc_value = None, traceback = None, pipein = None)

Thus:

    yield expr

becomes:

    send(expr)

which doesn't mean "this is a generator" or that control will *necessarily* be transferred to another thread here. It depends on whether the other thread has already done a next on the corresponding pipein.

I'm thinking that the C code (the bytecode interpreter) that manages Python stack frame objects would become detached from the C stack, so that a Python-to-Python call does not grow the C stack. This would allow the C code to fork the Python stack and switch between branches quite easily.

This separation of input and output would clean up most generator examples. Guido's tree flattener has special code to yield SKIPPED in response to a SKIP, because he doesn't really want a value returned from sending a SKIP in. This would no longer be necessary.

    def __iter__(self):
        skip = yield self.label
        if skip == SKIP:
            yield SKIPPED
        else:
            skip = yield ENTER
            if skip == SKIP:
                yield SKIPPED
            else:
                for child in self.children:
                    yield from child
                yield LEAVE   # I guess a SKIP can't be returned here?

becomes:

    def __iter__(self):
        return generate(self.flatten)

    def flatten(self):
        send(self.label)
        if next() != SKIP:
            send(ENTER)
            if next() != SKIP:
                for child in self.children:
                    child.flatten()
                send(LEAVE)

Also, the caller could then simply look like:

    for token in tree():
        if too_deep:
            send(SKIP)
        else:
            send(None)
        <process token>

rather than:

    response = None
    gen = tree()
    try:
        while True:
            token = gen.send(response)
            if too_deep:
                response = SKIP
            else:
                response = None
            <process token>
    except StopIteration:
        pass

The reason for this extra complexity is that send returns a value. Separating send from yielding values lets you call send from within for statements without having another value land in your lap that you would really rather have sent to the for statement. The same thing applies to throw: if throw didn't return a value, then it could easily be called within for statements.
The parsing example goes from:

    def scanner(text):
        for m in pat.finditer(text):
            token = m.group(0)
            print "Feeding:", repr(token)
            yield token
        yield None  # to signal EOF

    def parse_items(closing_tag = None):
        elems = []
        while 1:
            token = token_stream.next()
            if not token:
                break  # EOF
            if is_opening_tag(token):
                elems.append(parse_elem(token))
            elif token == closing_tag:
                break
            else:
                elems.append(token)
        return elems

    def parse_elem(opening_tag):
        name = opening_tag[1:-1]
        closing_tag = "</%s>" % name
        items = parse_items(closing_tag)
        return (name, items)

to:

    def scanner(text):
        for m in pat.finditer(text):
            token = m.group(0)
            print "Feeding:", repr(token)
            send(token)

    def parse_items(closing_tag = None):
        for token in next():
            if is_opening_tag(token):
                send(parse_elem(token))
            elif token == closing_tag:
                break
            else:
                send(token)

    def parse_elem(opening_tag):
        name = opening_tag[1:-1]
        closing_tag = "</%s>" % name
        items = list(generate(parse_items(closing_tag), pipein=pipein()))
        return (name, items)

and perhaps called as:

    tree = list(scanner(text) | parse_items())

This also obviates the need to do an initial next call when pushing (sending) to generators which are acting as consumers, a need which is difficult to explain and to understand.
I didn't mean to imply that buffering was required, or even desired. With no buffering, the sender and receiver stay in sync, just like generators. A write would suspend until a matching read, and vice versa. Only when the pipe sees both a write and a read would the object be transferred from the writer to the reader. Thus, write/read replaces yield as the way to suspend the current "thread".

This avoids the confusion about whether we're "pushing" or "pulling" to/from a generator. For example, itertools.tee is currently designed as a generator that "pulls" values from its iterable parameter. But then it can't switch roles to "push" values to its consumers, and so must be prepared to store values in case the consumers aren't synchronized with each other. With this new approach, the consumer waiting for the sent value would be activated by the pipe connecting it to tee. And if that consumer wasn't ready for a value yet, tee would be suspended until it was. So tee would not have to store any values:

    def tee():
        num_outputs = num_output_pipes()
        for input in next():
            for i in range(num_outputs):
                send(input, i)

Does this help?

-bruce frederiksen
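(For contrast, a sketch of roughly what the pull-based tee has to do today, adapted from the pure-Python equivalent shown in the itertools documentation; it makes visible the per-consumer buffering that the pipe model would avoid:)

    import collections

    def pull_tee(iterable, n=2):
        it = iter(iterable)
        # One buffer per output: values pulled on behalf of one consumer
        # must be remembered for the consumers that haven't asked yet.
        buffers = [collections.deque() for _ in range(n)]
        def gen(my_buffer):
            while True:
                if not my_buffer:
                    try:
                        value = next(it)
                    except StopIteration:
                        return
                    for b in buffers:
                        b.append(value)
                yield my_buffer.popleft()
        return tuple(gen(b) for b in buffers)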

Bruce Frederiksen wrote:
All this is higher-level stuff that can be built on the primitive operation of yielding. For instance it could easily be added to the scheduling library I'm about to post (I tried to post it yesterday, but it bounced). -- Greg

Steven D'Aprano wrote:
I want to write a function that plays "guess this number" by making successive guesses and getting a high/low response. My first version will generate random guesses:

    def rand_guesser(limit):
        lo = 0           # answer is > lo
        hi = limit + 1   # answer is < hi
        num_tries = 0
        while lo + 2 < hi:
            guess = random.randint(lo + 1, hi - 1)
            num_tries += 1
            result = yield guess
            if result == 0:
                break
            if result < 0:
                lo = guess
            else:
                hi = guess
        else:
            guess = lo + 1
        print "rand_guesser: got", guess, "in", num_tries, "tries"

and then the function that calls it:

    def test(guesser, limit):
        n = random.randint(1, limit)
        print "the secret number is", n
        try:
            guess = guesser.next()
            while True:
                print "got", guess
                guess = guesser.send(cmp(guess, n))
        except StopIteration:
            pass
        # guesser.close() isn't necessary if we got StopIteration,
        # because the generator has already finalized.
So far, so good. But how does binary_search compare with rand_guesser?

    def binary_search(limit):
        lo = 0
        hi = limit + 1
        num_tries = 0
        while lo + 2 < hi:
            guess = (hi + lo) // 2
            num_tries += 1
            result = yield guess
            if result == 0:
                break
            if result < 0:
                lo = guess
            else:
                hi = guess
        else:
            guess = lo + 1
        print "binary_search: got", guess, "in", num_tries, "tries"
Hmmm, but to compare these, I need to run them on the same answer number. I know, I can just chain them together. Then test will just see both sets of guesses back to back... Another obvious choice for itertools.chain!
Oops, that's right, itertools.chain doesn't play nicely with advanced generators... :-( So I guess I have to write my own intermediate multi_guesser... Luckily, we have yield from!

    def multi_guesser(l, limit):
        for gen in l:
            yield from gen(limit)

What does yield from do? It sets multi_guesser aside so that test can communicate directly with each gen. Objects yielded by the gen go directly back to test. And I would expect that objects sent from test (with send) would go directly to the gen. If that's the case, this works fine! If not, then I'm sad again and have to do something like:

    def multi_guesser(l, limit):
        for gen in l:
            g = gen(limit)
            try:
                guess = g.next()
                while True:
                    guess = g.send((yield guess))
            except StopIteration:
                pass

Which one do you think is more pythonic? Which one would you rather get stuck maintaining? (Personally, I'd vote for itertools.chain!)
-bruce frederiksen
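(A quick illustration, not from the thread, of why chain falls over here: a chain object only pulls values with next() and is not itself a generator, so it has no send() method at all and no way to route the high/low replies to the underlying guessers. It assumes the rand_guesser and binary_search definitions above and an imported random module:)

    import itertools

    limit = 100
    guessers = itertools.chain(rand_guesser(limit), binary_search(limit))
    first_guess = next(guessers)   # works: chain happily pulls values
    guessers.send(-1)              # AttributeError: 'itertools.chain' object
                                   # has no attribute 'send'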

On Fri, Feb 13, 2009 at 5:31 PM, Raymond Hettinger <python@rcn.com> wrote:
While I haven't read the PEP thoroughly, I believe I understand the concept of pass-through and I think I have a compelling use case, at least for passing through .send(). The rest then shouldn't be a problem. Let's also not forget that 99% of all uses of generators don't involve .send(), .throw() or .close().

My use case is flattening of trees, for example parse trees. For concreteness, assume a node has a label and a list of children. The iteration should receive ENTER and LEAVE pseudo-labels when entering a level. We can then write a pre-order iterator like this, using yield-from without caring about pass-through:

    def __iter__(self):
        yield self.label
        if self.children:
            yield ENTER
            for child in self.children:
                yield from child
            yield LEAVE

Now suppose the caller of the iteration wants to be able to occasionally truncate the traversal, e.g. it may not be interested in the subtree for certain labels, or it may want to skip very deep trees. It's not possible to anticipate what the caller wants to truncate, so we don't want to build direct support for e.g. skip-lists or level-control into the iterator. Instead, the caller now uses .send(SKIP) when it wants to skip a subtree. The iterator responds with a SKIPPED pseudo-label. For example:

    def __iter__(self):
        skip = yield self.label
        if skip == SKIP:
            yield SKIPPED
        else:
            skip = yield ENTER
            if skip == SKIP:
                yield SKIPPED
            else:
                for child in self.children:
                    yield from child
                yield LEAVE

I believe the pass-through semantics proposed for yield-from are *exactly* what we need in this case. Without it, the for-loop would have to be written like this:

    for child in self.children:
        it = iter(child)
        while True:
            try:
                value = it.send(skip)
            except StopIteration:
                break
            skip = yield value

Other remarks:

(a) I don't know if the PEP proposes that "yield from expr" should return the last value returned by (i.e. sent to) a yield somewhere deeply nested; I think this would be useful.

(b) I hope the PEP also explains what to do if "expr" is not a generator but some other kind of iterator. IMO it should work as long as .send() etc. are not used. I think it would probably be safest to raise an exception if .send() is used and the receiving iterator is not a generator. For .throw() and .close() it would probably be most useful to let them have their effect in the last generator on the stack.

(c) A quick skim of the PEP didn't show suggestions for how to implement this. I think this needs to be addressed. I don't think it will be possible to literally replace the outer generator with the inner one while that is running; the treatment of StopIteration probably requires some kind of chaining, so that there is still a cost associated with deeply nested yield-from clauses. However it could be much more efficient than explicit for-loops.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)
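(To make the caller's side concrete, a small driver sketch, not from the thread; it assumes the node class, SKIP, SKIPPED and the pseudo-labels from the example above, and relies on the proposed pass-through behaviour so that replies reach the nested nodes:)

    def walk(tree, boring_labels):
        it = iter(tree)
        reply = None          # sending None is the same as calling next()
        try:
            while True:
                token = it.send(reply)
                # Ask the iterator to skip the subtree under boring labels.
                reply = SKIP if token in boring_labels else None
                print token
        except StopIteration:
            pass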

Guido van Rossum wrote:
No, it doesn't. Its value is the value passed to the 'return' statement that terminates the subgenerator (and generators are enhanced to permit return with a value). My reason for doing this is so you can use subgenerators like functions in a generator that's being used as a lightweight thread.
(b) I hope the PEP also explains what to do if "expr" is not a generator but some other kind of iterator.
Yes, it does. Currently I'm proposing that, if the relevant methods are not defined, send() is treated like next(), and throw() and close() do what they would have done normally on the parent generator.
(c) A quick skim of the PEP didn't show suggestions for how to implement this.
One way would be to simply emit the bytecode corresponding to the presented expansion, although that wouldn't be very efficient in terms of either speed or code size. The PEP also sketches an optimised implementation in which the generator has a slot which refers to the generator being delegated to. Calls to next(), send(), throw() and close() are forwarded via this slot if it is nonempty. There will still be a small overhead involved in the delegation, but it's only a chain of C function calls instead of Python ones, which ought to be a big improvement. It might be possible to reduce the overhead even further by following the chain of delegation pointers in a loop until reaching the end and then calling the end generator directly. It would be trickier to get right, though, because you'd have to be prepared to back up and try earlier generators in the face of StopIterations. -- Greg
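(A rough pure-Python model of the slot idea, not from the PEP; the real slot would live inside the C-level generator object, and how the slot gets set when the outer generator executes a yield-from is glossed over here. It assumes the delegate, when present, is itself a generator with a send() method:)

    class DelegationSlot(object):
        """Forward next()/send() through a delegation slot, roughly as the
        optimised implementation would do at the C level."""

        def __init__(self, outer):
            self.outer = outer      # the delegating generator
            self.delegate = None    # set while a yield-from is in progress

        def send(self, value):
            if self.delegate is not None:
                try:
                    # Fast path: resume the inner generator directly.
                    return self.delegate.send(value)
                except StopIteration, e:
                    # Inner generator finished: clear the slot and resume
                    # the outer generator with the returned value.
                    self.delegate = None
                    result = e.args[0] if e.args else None
                    return self.outer.send(result)
            return self.outer.send(value)

        def next(self):
            return self.send(None)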

[Greg Ewing]
Looks like a language construct where only a handful of Python programmers will be able to correctly describe what it does. I've only seen requests for the functionality in the first bullet point. The rest seems like unnecessary complexity -- something that will take a page in the docs rather than a couple of lines.
This seems like it is awkwardly trying to cater to two competing needs. It recognizes that the outer generator may have a legitimate need to catch an exception, and that the inner generator might want it too. Unfortunately, only one of them can catch it, and there is no way to have both the inner and outer generator/iterator each do their part in servicing an exception. I am also concerned that this slows down the more common case of _i not having a throw method.

The same thoughts apply to send() and close(). Potentially, both an inner and an outer generator will need to respond to close(), but there is no way of doing both.

Raymond

On Mon, Feb 16, 2009 at 6:51 PM, Raymond Hettinger <python@rcn.com> wrote:
That doesn't necessarily matter. It's true for quite a few Python constructs that many Python programmers use without knowing every little semantic detail. If you don't use .send(), .throw(), .close(), the semantics of "yield from" are very simple to explain and remember. All the rest is there to make advanced uses possible. -- --Guido van Rossum (home page: http://www.python.org/~guido/)

Raymond Hettinger wrote:
The whole area of generators is one where I think only a minority of programmers will ever fully understand all the gory details. Those paragraphs are there for the purpose of providing a complete and rigorous specification of what's being proposed. For most people, almost all the essential information is summed up in this one sentence near the top:

    The effect is to run the iterator to exhaustion, during which time
    it behaves as though it were communicating directly with the caller
    of the generator containing the ``yield from`` expression.

As I've said, I believe it's actually quite simple conceptually. It's just that it gets messy trying to explain it by way of expansion into currently existing Python code and concepts.
I don't understand what you mean by that. If you were making an ordinary function call, you'd expect that the called function would get first try at catching any exception occurring while it's running, and if it doesn't, it propagates out to the calling function. Also it's not true that only one of them can catch the exception. The inner one might catch it, do some processing and then re-raise it. Or it might do something in a finally block. My intent is for all these things to work the same way when one generator delegates to another using yield-from. -- Greg
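(A small runnable illustration, not from the thread, of the behaviour described above, written with explicit delegation since yield-from does not exist yet: the inner generator gets first try at the exception, re-raises it, runs its finally block, and only then does the outer generator see it:)

    def inner():
        try:
            yield 1
            yield 2
        except ValueError:
            print "inner saw the exception first"
            raise                      # let it propagate outwards
        finally:
            print "inner finally runs"

    def outer():
        it = inner()
        try:
            value = it.next()
            while True:
                try:
                    yield value
                except ValueError, e:
                    value = it.throw(e)    # forward the throw to the inner generator
                else:
                    value = it.next()
        except ValueError:
            print "outer catches it afterwards"
        except StopIteration:
            pass

    g = outer()
    print g.next()              # 1
    try:
        g.throw(ValueError)     # inner reacts first, then outer; outer then
    except StopIteration:       # returns, so the throw() call raises StopIteration
        pass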
participants (7)

- Bruce Frederiksen
- Bruce Leban
- George Sakkis
- Greg Ewing
- Guido van Rossum
- Raymond Hettinger
- Steven D'Aprano