Possible PEP 380 tweak
[Changed subject]
On 2010-10-25 04:37, Guido van Rossum wrote:
This should not require threads.
Here's a bare-bones sketch using generators: [...]
On Mon, Oct 25, 2010 at 3:19 AM, Jacob Holm
If you don't care about allowing the funcs to raise StopIteration, this can actually be simplified to: [...]
Indeed, I realized this after posting. :-) I had several other ideas for improvements, e.g. being able to pass an initial value to the reduce-like function or even being able to supply a reduce-like function of one's own.
More interesting (to me at least) is that this is an excellent example of why I would like to see a version of PEP380 where "close" on a generator can return a value (AFAICT the version of PEP380 on http://www.python.org/dev/peps/pep-0380 is not up-to-date and does not mention this possibility, or even link to the heated discussion we had on python-ideas around march/april 2009).
Can you dig up the link here? I recall that discussion but I don't recall a clear conclusion coming from it -- just heated debate. Based on my example I have to agree that returning a value from close() would be nice. There is a little detail, how multiple arguments to StopIteration should be interpreted, but that's not so important if it's being raised by a return statement.
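For illustration only, here is a tiny sketch of that detail: with PEP 380's "return x" in a generator (or an explicit raise StopIteration(x) on the Pythons of this thread), the terminating StopIteration carries x as its sole argument, so a value-returning close() would only ever need args[0]. The gen() name is just an example.

def gen():
    return 42
    yield            # unreachable; makes this a generator function

try:
    next(gen())
except StopIteration as err:
    print(err.args)  # (42,)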
Assuming that "close" on a reduce_collector generator instance returns the value of the StopIteration raised by the "return" statements, we can simplify the code even further:
def reduce_collector(func):
    try:
        outcome = yield
    except GeneratorExit:
        return None
    while True:
        try:
            val = yield
        except GeneratorExit:
            return outcome
        outcome = func(outcome, val)
def parallel_reduce(iterable, funcs):
    collectors = [reduce_collector(func) for func in funcs]
    for coll in collectors:
        next(coll)
    for val in iterable:
        for coll in collectors:
            coll.send(val)
    return [coll.close() for coll in collectors]
Yes, this is only saving a few lines, but I find it *much* more readable...
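For concreteness, a small usage sketch, assuming a close() that returns the generator's return value as proposed here (the range(100) example matches the one used later in the thread):

# Assumes the proposed value-returning close(); with a close() that
# discards the return value this would print [None, None] instead.
print(parallel_reduce(range(100), [min, max]))   # -> [0, 99]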
I totally agree that not having to call throw() and catch whatever it bounces back is much nicer. (Now I wish there was a way to avoid the "try..except GeneratorExit" construct in the generator, but I think I should stop while I'm ahead. :-)

The interesting thing is that I've been dealing with generators used as coroutines or tasks intensely on and off since July, and I haven't had a single need for any of the three patterns that this example happened to demonstrate:

- the need to "prime" the generator in a separate step
- throwing and catching GeneratorExit
- getting a value from close()

(I did have a lot of use for send(), throw(), and extracting a value from StopIteration.)

In my context, generators are used to emulate concurrently running tasks, and "yield" is always used to mean "block until this piece of async I/O is complete, and wake me up with the result". This is similar to the "classic" trampoline code found in PEP 342.

In fact, when I wrote the example for this thread, I fumbled a bit because the use of generators there is different from the way I had been using them (though it was no doubt thanks to having worked with them intensely that I came up with the example quickly).

So, it is clear that generators are extremely versatile, and PEP 380 deserves several good use cases to explain all the API subtleties.

BTW, while I have you, what do you think of Greg's "cofunctions" proposal?

-- --Guido van Rossum (python.org/~guido)
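For readers who haven't seen that pattern, here is a minimal trampoline sketch in the spirit of PEP 342; the run() helper, the add/main tasks, and the sub-task protocol (every yield hands over a sub-generator, results come back via StopIteration and PEP 380's return-with-value) are illustrative assumptions, not code from this thread.

# Minimal, illustrative trampoline sketch (names and protocol are assumptions).
# Each task is a generator; "result = yield subtask" means "run subtask and
# resume me with its return value".
def run(task):
    stack = []                           # callers waiting for a sub-task
    value = None
    while True:
        try:
            subtask = task.send(value)   # send(None) also primes a new task
        except StopIteration as stop:
            result = stop.args[0] if stop.args else None
            if not stack:
                return result            # outermost task finished
            task, value = stack.pop(), result
            continue
        stack.append(task)
        task, value = subtask, None

def add(a, b):
    return a + b                         # PEP 380's return-with-value
    yield                                # never reached; makes this a generator

def main():
    total = yield add(1, 2)              # "block" until the sub-task completes
    total = yield add(total, 10)
    return total

print(run(main()))                       # -> 13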
On 10/25/2010 10:13 AM, Guido van Rossum wrote:
[Changed subject]
On 2010-10-25 04:37, Guido van Rossum wrote:
This should not require threads.
Here's a bare-bones sketch using generators: [...]
On Mon, Oct 25, 2010 at 3:19 AM, Jacob Holm
wrote: If you don't care about allowing the funcs to raise StopIteration, this can actually be simplified to: [...]
Indeed, I realized this after posting. :-) I had several other ideas for improvements, e.g. being able to pass an initial value to the reduce-like function or even being able to supply a reduce-like function of one's own.
More interesting (to me at least) is that this is an excellent example of why I would like to see a version of PEP380 where "close" on a generator can return a value (AFAICT the version of PEP380 on http://www.python.org/dev/peps/pep-0380 is not up-to-date and does not mention this possibility, or even link to the heated discussion we had on python-ideas around march/april 2009).
Can you dig up the link here?
I recall that discussion but I don't recall a clear conclusion coming from it -- just heated debate.
Based on my example I have to agree that returning a value from close() would be nice. There is a little detail, how multiple arguments to StopIteration should be interpreted, but that's not so important if it's being raised by a return statement.
Assuming that "close" on a reduce_collector generator instance returns the value of the StopIteration raised by the "return" statements, we can simplify the code even further:
def reduce_collector(func):
    try:
        outcome = yield
    except GeneratorExit:
        return None
    while True:
        try:
            val = yield
        except GeneratorExit:
            return outcome
        outcome = func(outcome, val)
def parallel_reduce(iterable, funcs):
    collectors = [reduce_collector(func) for func in funcs]
    for coll in collectors:
        next(coll)
    for val in iterable:
        for coll in collectors:
            coll.send(val)
    return [coll.close() for coll in collectors]
Yes, this is only saving a few lines, but I find it *much* more readable...
I totally agree that not having to call throw() and catch whatever it bounces back is much nicer. (Now I wish there was a way to avoid the "try..except GeneratorExit" construct in the generator, but I think I should stop while I'm ahead. :-)
This is how my mind wants to write this.

@consumer
def reduce_collector(func):
    try:
        value = yield          # No value to yield here.
        while True:
            value = func((yield), value)   # or here.
    except YieldError:
        # next was called not send.
        yield = value

def parallel_reduce(iterable, funcs):
    collectors = [reduce_collector(func) for func in funcs]
    for v in iterable:
        for coll in collectors:
            coll.send(v)
    return [next(c) for c in collectors]

It nicely separates input and output parts of a co-function, which can be tricky to get right when you have to receive and send at the same yield.

Maybe in Python 4k? Oh well. :-)
The interesting thing is that I've been dealing with generators used as coroutines or tasks intensely on and off since July, and I haven't had a single need for any of the three patterns that this example happened to demonstrate:
- the need to "prime" the generator in a separate step
Having a consumer decorator would be good.

def consumer(f):
    @wraps(f)
    def wrapper(*args, **kwds):
        coroutine = f(*args, **kwds)
        next(coroutine)
        return coroutine
    return wrapper

Or maybe it would be possible for Python to autostart a generator if it's sent a value before it's started? Currently you get an almost useless TypeError. The reason it's almost useless is that unless you are testing for it right after you create the generator, you can't (easily) be sure it's not from someplace inside the generator.

Ron
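As a concrete illustration of that TypeError, a small sketch (the collector() generator is a made-up example; the decorator is the one sketched above):

from functools import wraps

def consumer(f):                    # same idea as the decorator above
    @wraps(f)
    def wrapper(*args, **kwds):
        coroutine = f(*args, **kwds)
        next(coroutine)             # prime it so send() works right away
        return coroutine
    return wrapper

def collector():                    # made-up example consumer
    total = 0
    while True:
        total += yield

try:
    collector().send(1)             # send() before the first next()
except TypeError as err:
    print(err)   # can't send non-None value to a just-started generator

primed = consumer(collector)()      # wrapper has already called next()
primed.send(1)                      # works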
Minor correction... On 10/25/2010 02:53 PM, Ron Adam wrote:
@consumer
def reduce_collector(func):
    try:
        value = yield          # No value to yield here.
        while True:
            value = func((yield), value)   # or here.
    except YieldError:
        # next was called not send.
        yield = value
This line should have been "yield value" not "yield = value".
def parallel_reduce(iterable, funcs):
    collectors = [reduce_collector(func) for func in funcs]
    for v in iterable:
        for coll in collectors:
            coll.send(v)
    return [next(c) for c in collectors]
On Mon, Oct 25, 2010 at 12:53 PM, Ron Adam
This is how my mind wants to write this.
@consumer
def reduce_collector(func):
    try:
        value = yield          # No value to yield here.
        while True:
            value = func((yield), value)   # or here.
    except YieldError:
IIUC this works today if you substitute GeneratorExit and use c.close() instead of next(c) below. (I don't recall why I split it out into two different try/except blocks, but it doesn't seem necessary.)

As for being able to distinguish next(c) from c.send(None), that's a few language revisions too late. Perhaps more to the point, I don't like that idea; it breaks the general treatment of things that return None and of throwing away values. (Long, long, long ago there were situations where Python balked when you threw away a non-None value. The feature was booed off the island and it's better this way.)
        # next was called not send.
        yield value
I object to overloading yield for both a *resumable* operation and returning a (final) value; that's why PEP 380 will let you write "return value". (Many alternatives were considered but we always come back to the simple "return value".)
def parallel_reduce(iterable, funcs):
    collectors = [reduce_collector(func) for func in funcs]
    for v in iterable:
        for coll in collectors:
            coll.send(v)
    return [next(c) for c in collectors]
I really object to using next() for both getting the return value and the next yielded value. Jacob's proposal to spell this as c.close() sounds much better to me.
It nicely separates input and output parts of a co-function, which can be tricky to get right when you have to receive and send at the same yield.
I don't think there was a problem with this in my code (or if there was you didn't solve it).
Maybe in Python 4k? Oh well. :-)
Nah.
The interesting thing is that I've been dealing with generators used as coroutines or tasks intensely on and off since July, and I haven't had a single need for any of the three patterns that this example happened to demonstrate:
- the need to "prime" the generator in a separate step
Having a consumer decorator would be good.
def consumer(f):
    @wraps(f)
    def wrapper(*args, **kwds):
        coroutine = f(*args, **kwds)
        next(coroutine)
        return coroutine
    return wrapper
This was proposed during the PEP 380 discussions. I still don't like it because I can easily imagine situations where sending an initial None falls totally naturally out of the sending logic (as it does for my async tasks use case), and it would be a shame if the generator's declaration prevented this.
Or maybe it would be possible for python to autostart a generator if it's sent a value before it's started? Currently you get an almost useless TypeError. The reason it's almost useless is unless you are testing for it right after you create the generator, you can't (easily) be sure it's not from someplace inside the generator.
I'd be okay with this raising a different exception (though for compatibility it would have to subclass TypeError). I'd also be okay with having a property on generator objects that lets you inspect the state. There should really be three states: not yet started, started, finished -- and of course "started and currently executing", but that one is already exposed via g.gi_running.

Changing the behavior on .send(val) doesn't strike me as a good idea, because the caller would be missing the first value yielded! IOW I want to support this use case but not make it the central driving use case for the API design.

-- --Guido van Rossum (python.org/~guido)
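For reference, the states described here map directly onto what inspect.getgeneratorstate() reports; that helper was added in Python 3.2, shortly after this thread. A small demonstration (demo() is just an example name):

import inspect

def demo():
    yield 1

g = demo()
print(inspect.getgeneratorstate(g))   # GEN_CREATED   (not yet started)
print(next(g))                        # 1
print(inspect.getgeneratorstate(g))   # GEN_SUSPENDED (started, paused at a yield)
print(g.gi_running)                   # False; True only while the frame executes
g.close()
print(inspect.getgeneratorstate(g))   # GEN_CLOSED    (finished)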
On 10/25/2010 03:21 PM, Guido van Rossum wrote:
On Mon, Oct 25, 2010 at 12:53 PM, Ron Adam
wrote: This is how my mind wants to write this.
@consumer
def reduce_collector(func):
    try:
        value = yield          # No value to yield here.
        while True:
            value = func((yield), value)   # or here.
    except YieldError:
IIUC this works today if you substitute GeneratorExit and use c.close() instead of next(c) below. (I don't recall why I split it out into two different try/except blocks but it doesn't seem necessary.
I tried it, c.close() doesn't work yet, but it does work with c.throw(GeneratorExit) :-) But that still uses yield to get the value.

I used a different way of starting the generator that checks for a value being yielded.

class GeneratorStartError(TypeError):
    pass

def start(g):
    value = next(g)
    if value is not None:
        raise GeneratorStartError('started generator yielded a value')
    return g

def reduce_collector(func):
    value = None
    try:
        value = yield
        while True:
            value = func((yield), value)
    except GeneratorExit:
        yield value

def parallel_reduce(iterable, funcs):
    collectors = [start(reduce_collector(func)) for func in funcs]
    for v in iterable:
        for coll in collectors:
            coll.send(v)
    return [c.throw(GeneratorExit) for c in collectors]

def main():
    it = range(100)
    print(parallel_reduce(it, [min, max]))

if __name__ == '__main__':
    main()
As for being able to distinguish next(c) from c.send(None), that's a few language revisions too late. Perhaps more to the point, I don't like that idea; it breaks the general treatment of things that return None and throwing away values. (Long, long, long ago there were situations where Python balked when you threw away a non-None value. The feature was boohed off the island and it's better this way.)
I'm not sure I follow the relationship you suggest. No values would be thrown away. Or did you mean that it should be OK to throw away values? I don't think it would prevent that either.

What the YieldError case really does is give the generator a bit more control. As far as the calling routine that uses it is concerned, it just works. What happened inside the generator is completely transparent to the routine using the generator. If the calling routine does see a YieldError, it probably means there was a bug.
        # next was called not send.
        yield value
I object to overloading yield for both a *resumable* operation and returning a (final) value; that's why PEP 380 will let you write "return value". (Many alternatives were considered but we always come back to the simple "return value".)
That works for me. I think lot of people will find it easy to learn.
def parallel_reduce(iterable, funcs):
    collectors = [reduce_collector(func) for func in funcs]
    for v in iterable:
        for coll in collectors:
            coll.send(v)
    return [next(c) for c in collectors]
I really object to using next() for both getting the return value and the next yielded value. Jacob's proposal to spell this as c.close() sounds much better to me.
If c.close() also throws the GeneratorExit and returns a value, that would be cool. Thanks. I take it that the objections have more to do with style and coding practices than with what is possible.
It nicely separates input and output parts of a co-function, which can be tricky to get right when you have to receive and send at the same yield.
I don't think there was a problem with this in my code (or if there was you didn't solve it).
There wasn't in this code. This is one of those areas where it can be really difficult to find the correct way to express a co-function that does both input and output, but not necessarily in a fixed order.

I begin almost any co-function with this at the top of the loop and later trim it up if parts of it aren't needed.

out_value = None
while True:
    in_value = yield out_value
    out_value = None
    ...  # rest of loop to check in_value and modify out_value

As long as None isn't a valid data item, this works most of the time.
Maybe in Python 4k? Oh well. :-)
Nah.
I'm ok with that. Ron
On Mon, Oct 25, 2010 at 6:01 PM, Ron Adam
On 10/25/2010 03:21 PM, Guido van Rossum wrote:
On Mon, Oct 25, 2010 at 12:53 PM, Ron Adam
wrote: This is how my mind wants to write this.
@consumer
def reduce_collector(func):
    try:
        value = yield          # No value to yield here.
        while True:
            value = func((yield), value)   # or here.
    except YieldError:
IIUC this works today if you substitute GeneratorExit and use c.close() instead of next(c) below. (I don't recall why I split it out into two different try/except blocks but it doesn't seem necessary.
I tried it, c.close() doesn't work yet, but it does work with c.throw(GeneratorExit) :-) But that still uses yield to get the value.
Yeah, sorry, I didn't mean to say that g.close() would return the value, but that you can use GeneratorExit here. g.close() *does* throw GeneratorExit (that's PEP 342); but it doesn't return the value yet. I like adding that to PEP 380 though.
I used a different way of starting the generator that checks for a value being yielded.
class GeneratorStartError(TypeError):
    pass

def start(g):
    value = next(g)
    if value is not None:
        raise GeneratorStartError('started generator yielded a value')
    return g
Whatever tickles your fancy. I just don't think this deserves a builtin.
def reduce_collector(func):
    value = None
    try:
        value = yield
        while True:
            value = func((yield), value)
    except GeneratorExit:
        yield value
Even today, I would much prefer using raise StopIteration(value) over yield value (or yield Return(value)). Reusing yield to return a value just looks wrong to me; there are too many ways to get confused (and this area doesn't need more of that :-).
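For illustration, Ron's collector rewritten in the style preferred here. The final line shows PEP 380's spelling; the commented-out raise StopIteration(value) is the "even today" form, which only propagates out of a generator on pre-PEP 479 Pythons.

def reduce_collector(func):
    value = None
    try:
        value = yield
        while True:
            value = func((yield), value)
    except GeneratorExit:
        # "Even today": raise StopIteration(value)   (pre-PEP 479 Pythons only)
        # With PEP 380 this simply becomes:
        return value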
def parallel_reduce(iterable, funcs):
    collectors = [start(reduce_collector(func)) for func in funcs]
    for v in iterable:
        for coll in collectors:
            coll.send(v)
    return [c.throw(GeneratorExit) for c in collectors]

def main():
    it = range(100)
    print(parallel_reduce(it, [min, max]))

if __name__ == '__main__':
    main()
As for being able to distinguish next(c) from c.send(None), that's a few language revisions too late. Perhaps more to the point, I don't like that idea; it breaks the general treatment of things that return None and throwing away values. (Long, long, long ago there were situations where Python balked when you threw away a non-None value. The feature was boohed off the island and it's better this way.)
I'm not sure I follow the relationship you suggest. No values would be thrown away. Or did you mean that it should be ok to throw away values? I don't think it would prevent that either.
Well maybe I was misunderstanding your proposed YieldError. You didn't really explain it -- you just used it and assumed everybody understood what you meant.

My assumption was that you meant for YieldError to be raised if yield was used as an expression (not a statement) but next() was called instead of send(). My response was that it's ugly to make a distinction between

x = <expr>
del x  # Or just not use x

and

<expr>

But maybe I misunderstood what you meant.
What the YieldError case really does is give the generator a bit more control. As far as the calling routine that uses it is concerned, it just works. What happend inside the generator is completely transparent to the routine using the generator. If the calling routine does see a YieldError, it means it probably was a bug.
That sounds pretty close to the rules for GeneratorExit.
        # next was called not send.
        yield value
I object to overloading yield for both a *resumable* operation and returning a (final) value; that's why PEP 380 will let you write "return value". (Many alternatives were considered but we always come back to the simple "return value".)
That works for me. I think lot of people will find it easy to learn.
def parallel_reduce(iterable, funcs):
    collectors = [reduce_collector(func) for func in funcs]
    for v in iterable:
        for coll in collectors:
            coll.send(v)
    return [next(c) for c in collectors]
I really object to using next() for both getting the return value and the next yielded value. Jacob's proposal to spell this as c.close() sounds much better to me.
If c.close also throws the GeneratorExit and returns a value, that would be cool. Thanks.
It does throw GeneratorExit (that's the whole reason for GeneratorExit's existence :-).
I take it that the objections have more to do with style and coding practices rather than what is possible.
Yeah, it's my gut and that's hard to reason with but usually right. (See also: http://www.amazon.com/How-We-Decide-Jonah-Lehrer/dp/0618620117 )
It nicely separates input and output parts of a co-function, which can be tricky to get right when you have to receive and send at the same yield.
I don't think there was a problem with this in my code (or if there was you didn't solve it).
There wasn't in this code. This is one of those areas where it can be really difficult to find the correct way to express a co-function that does both input and output, but not necessarily in a fixed order.
Maybe for that one should use a "channel" abstraction, like Go (and before it, CSP)? I noticed that Monocle (http://github.com/saucelabs/monocle) has a demo of that in its "experimental" module (but the example is kind of silly).
I begin almost any co-function with this at the top of the loop and later trim it up if parts of it aren't needed.
out_value = None
while True:
    in_value = yield out_value
    out_value = None
    ...  # rest of loop to check in_value and modify out_value
As long as None isn't a valid data item, this works most of the time.
Maybe in Python 4k? Oh well. :-)
Nah.
I'm ok with that.
Ron
-- --Guido van Rossum (python.org/~guido)
On 2010-10-25 17:13, Guido van Rossum wrote:
On Mon, Oct 25, 2010 at 3:19 AM, Jacob Holm
wrote: More interesting (to me at least) is that this is an excellent example of why I would like to see a version of PEP380 where "close" on a generator can return a value (AFAICT the version of PEP380 on http://www.python.org/dev/peps/pep-0380 is not up-to-date and does not mention this possibility, or even link to the heated discussion we had on python-ideas around march/april 2009).
Can you dig up the link here?
I recall that discussion but I don't recall a clear conclusion coming from it -- just heated debate.
Well here is a recap of the end of the discussion about how to handle generator return values and g.close().

Greg's conclusion that g.close() should not return a value:
http://mail.python.org/pipermail/python-ideas/2009-April/003959.html

My reply (ordered list of ways to handle return values in generators):
http://mail.python.org/pipermail/python-ideas/2009-April/003984.html

Some arguments for storing the return value on the generator:
http://mail.python.org/pipermail/python-ideas/2009-April/004008.html

Some support for that idea from Nick:
http://mail.python.org/pipermail/python-ideas/2009-April/004012.html

You're not convinced by Greg's argument:
http://mail.python.org/pipermail/python-ideas/2009-April/003985.html

Greg arguing that using GeneratorExit this way is bad:
http://mail.python.org/pipermail/python-ideas/2009-April/004001.html

You add a new complete proposal including g.close() returning a value:
http://mail.python.org/pipermail/python-ideas/2009-April/003944.html

I point out some problems e.g. with the handling of return values:
http://mail.python.org/pipermail/python-ideas/2009-April/003981.html

Then the discussion goes on at length about the problems of using a coroutine decorator with yield-from. At one point I am arguing for generators to keep a reference to the last value yielded:
http://mail.python.org/pipermail/python-ideas/2009-April/004032.html

And you reply that storing "unnatural" state on the generator or frame object is a bad idea:
http://mail.python.org/pipermail/python-ideas/2009-April/004034.html

From which I concluded that having g.close() return a value (the same on each successive call) would be a no-go:
http://mail.python.org/pipermail/python-ideas/2009-April/004040.html

Which you confirmed:
http://mail.python.org/pipermail/python-ideas/2009-April/004041.html

The latest draft (#13) I have been able to find was announced in
http://mail.python.org/pipermail/python-ideas/2009-April/004189.html

And can be found at
http://mail.python.org/pipermail/python-ideas/attachments/20090419/c7d72ba8/...

I had some later suggestions for how to change the expansion, see e.g.
http://mail.python.org/pipermail/python-ideas/2009-April/004195.html
(I find that version easier to reason about even now, 1½ years later)
Based on my example I have to agree that returning a value from close() would be nice. There is a little detail, how multiple arguments to StopIteration should be interpreted, but that's not so important if it's being raised by a return statement.
Right. I would assume that the return value of g.close(), if we ever got one, was to be taken from the first argument to the StopIteration.

What killed the proposal last time was the question of what should happen when you call g.close() on an exhausted generator. My preferred solution was (and is) that the generator should save the value from the terminating StopIteration (or None if it ended by some other means), that g.close() should return that value each time, and that g.next(), g.send() and g.throw() should raise a StopIteration with the value. Unless you have changed your position on storing the return value, that solution is dead in the water.

For this use case we don't actually need to call close() on an exhausted generator, so perhaps there is *some* use in only returning a value when the generator is actually running. Here's a stupid idea... let g.close take an optional argument that it can return if the generator is already exhausted, and let it return the value from the StopIteration otherwise.

def close(self, default=None):
    if self.gi_frame is None:
        return default
    try:
        self.throw(GeneratorExit)
    except StopIteration as e:
        return e.args[0]
    except GeneratorExit:
        return None
    else:
        raise RuntimeError('generator ignored GeneratorExit')
I totally agree that not having to call throw() and catch whatever it bounces back is much nicer. (Now I wish there was a way to avoid the "try..except GeneratorExit" construct in the generator, but I think I should stop while I'm ahead. :-)
The interesting thing is that I've been dealing with generators used as coroutines or tasks intensely on and off since July, and I haven't had a single need for any of the three patterns that this example happened to demonstrate:
- the need to "prime" the generator in a separate step - throwing and catching GeneratorExit - getting a value from close()
(I did have a lot of use for send(), throw(), and extracting a value from StopIteration.)
I think these things (at least priming and close()) are mostly an issue when using coroutines from non-coroutines. That means it is likely to be common in small examples where you write the whole program, but less common when you are writing small(ish) parts of a larger framework. Throwing and catching GeneratorExit is not common, and according to some shouldn't be used for this purpose at all.
In my context, generators are used to emulate concurrently running tasks, and "yield" is always used to mean "block until this piece of async I/O is complete, and wake me up with the result". This is similar to the "classic" trampoline code found in PEP 342.
In fact, when I wrote the example for this thread, I fumbled a bit because the use of generators there is different than I had been using them (though it was no doubt thanks to having worked with them intensely that I came up with the example quickly).
This sounds a lot like working in a "larger framework" to me. :)
So, it is clear that generators are extremely versatile, and PEP 380 deserves several good use cases to explain all the API subtleties.
I like your example because it matches the way I would have used generators to solve it. OTOH, it is not hard to rewrite parallel_reduce as a traditional function. In fact, the result is a bit shorter and quite a bit faster so it is not a good example of what you need generators for.
BTW, while I have you, what do you think of Greg's "cofunctions" proposal?
I'll have to get back to you on that. - Jacob
On 10/25/2010 10:13 AM, Guido van Rossum wrote:
BTW, while I have you, what do you think of Greg's "cofunctions" proposal?
Well, my .5 cents' worth, for what it's worth. I'm still undecided. Because of the many optimizations Python has had in the last year on speeding up attribute access (thanks, guys!), classes don't get penalized as much as they used to be. So I'd like to see some speed comparisons of classes vs. co-functions. I think classes are much easier to use and may not be as slow as some may think.

Ron
On Mon, Oct 25, 2010 at 6:35 PM, Jacob Holm
On 2010-10-25 17:13, Guido van Rossum wrote:
On Mon, Oct 25, 2010 at 3:19 AM, Jacob Holm
wrote: More interesting (to me at least) is that this is an excellent example of why I would like to see a version of PEP380 where "close" on a generator can return a value (AFAICT the version of PEP380 on http://www.python.org/dev/peps/pep-0380 is not up-to-date and does not mention this possibility, or even link to the heated discussion we had on python-ideas around march/april 2009).
Can you dig up the link here?
I recall that discussion but I don't recall a clear conclusion coming from it -- just heated debate.
Well here is a recap of the end of the discussion about how to handle generator return values and g.close().
Thanks, very thorough!
Gregs conclusion that g.close() should not return a value: http://mail.python.org/pipermail/python-ideas/2009-April/003959.html
My reply (ordered list of ways to handle return values in generators): http://mail.python.org/pipermail/python-ideas/2009-April/003984.html
Some arguments for storing the return value on the generator: http://mail.python.org/pipermail/python-ideas/2009-April/004008.html
Some support for that idea from Nick: http://mail.python.org/pipermail/python-ideas/2009-April/004012.html
You're not convinced by Gregs argument: http://mail.python.org/pipermail/python-ideas/2009-April/003985.html
Greg arguing that using GeneratorExit this way is bad: http://mail.python.org/pipermail/python-ideas/2009-April/004001.html
You add a new complete proposal including g.close() returning a value: http://mail.python.org/pipermail/python-ideas/2009-April/003944.html
I point out some problems e.g. with the handling of return values: http://mail.python.org/pipermail/python-ideas/2009-April/003981.html
Then the discussion goes on at length about the problems of using a coroutine decorator with yield-from. At one point I am arguing for generators to keep a reference to the last value yielded: http://mail.python.org/pipermail/python-ideas/2009-April/004032.html
And you reply that storing "unnatural" state on the generator or frame object is a bad idea: http://mail.python.org/pipermail/python-ideas/2009-April/004034.html
From which I concluded that having g.close() return a value (the same on each successive call) would be a no-go: http://mail.python.org/pipermail/python-ideas/2009-April/004040.html
Which you confirmed: http://mail.python.org/pipermail/python-ideas/2009-April/004041.html
The latest draft (#13) I have been able to find was announced in http://mail.python.org/pipermail/python-ideas/2009-April/004189.html
And can be found at http://mail.python.org/pipermail/python-ideas/attachments/20090419/c7d72ba8/...
Hmm... It does look like the PEP editors dropped the ball on this one (or maybe Greg didn't mail it directly to them). It doesn't seem there are substantial differences from the published version at http://www.python.org/dev/peps/pep-0380/ though; close() still doesn't return a value.
I had some later suggestions for how to change the expansion, see e.g. http://mail.python.org/pipermail/python-ideas/2009-April/004195.html (I find that version easier to reason about even now 1½ years later)
Hopefully you & Greg can agree on a new draft. I'd like this to make progress and I really want this to appear in 3.3. But I don't have the time to do the editing and reviewing of the PEP.
Based on my example I have to agree that returning a value from close() would be nice. There is a little detail, how multiple arguments to StopIteration should be interpreted, but that's not so important if it's being raised by a return statement.
Right. I would assume that the return value of g.close() if we ever got one was to be taken from the first argument to the StopIteration.
That's a reasonable position. Monocle currently makes it so that using yield Return(x, y, z) [which in my view should be spelled raise Return(x, y, z0] is equivalent to return x, y, z, but there's no real need if the latter syntax is actually supported.
What killed the proposal last time was the question of what should happen when you call g.close() on an exhausted generator. My preferred solution was (and is) that the generator should save the value from the terminating StopIteration (or None if it ended by some other means) and that g.close() should return that value each time and g.next(), g.send() and g.throw() should raise a StopIteration with the value. Unless you have changed your position on storing the return value, that solution is dead in the water.
I haven't changed my position. Closing a file twice doesn't do anything the second time either.
For this use case we don't actually need to call close() on an exhausted generator so perhaps there is *some* use in only returning a value when the generator is actually running.
:-)
Here's a stupid idea... let g.close take an optional argument that it can return if the generator is already exhausted and let it return the value from the StopIteration otherwise.
def close(self, default=None):
    if self.gi_frame is None:
        return default
    try:
        self.throw(GeneratorExit)
    except StopIteration as e:
        return e.args[0]
    except GeneratorExit:
        return None
    else:
        raise RuntimeError('generator ignored GeneratorExit')
You'll have to explain why None isn't sufficient.
I totally agree that not having to call throw() and catch whatever it bounces back is much nicer. (Now I wish there was a way to avoid the "try..except GeneratorExit" construct in the generator, but I think I should stop while I'm ahead. :-)
The interesting thing is that I've been dealing with generators used as coroutines or tasks intensely on and off since July, and I haven't had a single need for any of the three patterns that this example happened to demonstrate:
- the need to "prime" the generator in a separate step - throwing and catching GeneratorExit - getting a value from close()
(I did have a lot of use for send(), throw(), and extracting a value from StopIteration.)
I think these things (at least priming and close()) are mostly an issue when using coroutines from non-coroutines. That means it is likely to be common in small examples where you write the whole program, but less common when you are writing small(ish) parts of a larger framework.
Throwing and catching GeneratorExit is not common, and according to some shouldn't be used for this purpose at all.
Well, *throwing* it is close()'s job. And *catching* it ought to be pretty rare. Maybe this idiom would be better:

def sum():
    total = 0
    try:
        while True:
            value = yield
            total += value
    finally:
        return total
In my context, generators are used to emulate concurrently running tasks, and "yield" is always used to mean "block until this piece of async I/O is complete, and wake me up with the result". This is similar to the "classic" trampoline code found in PEP 342.
In fact, when I wrote the example for this thread, I fumbled a bit because the use of generators there is different than I had been using them (though it was no doubt thanks to having worked with them intensely that I came up with the example quickly).
This sounds a lot like working in a "larger framework" to me. :)
Possibly. I realize that I have code something like this:

next_input = None
while ...not done yet...:
    output = gen.send(next_input)
    next_input = ...computed from output...  # many variations

which quite naturally computes next_input from output but it does start out with an initial value of None for next_input in order to prime the pump.
So, it is clear that generators are extremely versatile, and PEP 380 deserves several good use cases to explain all the API subtleties.
I like your example because it matches the way I would have used generators to solve it. OTOH, it is not hard to rewrite parallel_reduce as a traditional function. In fact, the result is a bit shorter and quite a bit faster so it is not a good example of what you need generators for.
I'm not sure I understand. Maybe you meant to rewrite it as a class? There's some state that wouldn't have a good place to live without either a class or a (generator) stackframe to survive.
BTW, while I have you, what do you think of Greg's "cofunctions" proposal?
I'll have to get back to you on that.
- Jacob
-- --Guido van Rossum (python.org/~guido)
By the way, here's how to emulate the value-returning close() on a generator, assuming the generator uses raise StopIteration(x) to mean return x:

def gclose(gen):
    try:
        gen.throw(GeneratorExit)
    except StopIteration as err:
        if err.args:
            return err.args[0]
    except GeneratorExit:
        pass
    return None

I like this because it's fairly straightforward (except for the detail of having to also catch GeneratorExit). In fact it would be a really simple change to gen_close() in genobject.c -- the only change needed there would be to return err.args[0]. I like small evolutionary improvements to APIs.

-- --Guido van Rossum (python.org/~guido)
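A minimal usage sketch for the gclose() helper above. The summer() generator is made up for illustration; it reports its result with PEP 380's "return value" (which raises StopIteration(value) under the hood), whereas on the Python of this thread one would write "raise StopIteration(total)" as the message assumes.

def summer():
    total = 0
    try:
        while True:
            total += yield
    except GeneratorExit:
        return total

g = summer()
next(g)                 # prime it
for x in (1, 2, 3):
    g.send(x)
print(gclose(g))        # -> 6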
On 10/25/2010 08:34 PM, Guido van Rossum wrote:
On Mon, Oct 25, 2010 at 6:01 PM, Ron Adam
wrote: On 10/25/2010 03:21 PM, Guido van Rossum wrote:
On Mon, Oct 25, 2010 at 12:53 PM, Ron Adam
wrote: This is how my mind wants to write this.
@consumer
def reduce_collector(func):
    try:
        value = yield          # No value to yield here.
        while True:
            value = func((yield), value)   # or here.
    except YieldError:
Well maybe I was misunderstanding your proposed YieldError. You didn't really explain it -- you just used it and assumed everybody understood what you meant.
Sorry about that, it is too easy to think something is clear on these boards when in fact it isn't as clear as we (I in this case) think it is.

hmmm ... I feel a bit embarrassed because I wasn't really meaning to try to convince you to do this. It's just what first came to mind when I asked myself, "if there was an easier way to write it, how would I do it?". As you pointed out, it isn't that much different from the c.close() example Jacob gave. To me, that is a nice indication that you (and Jacob and Greg) are on the right track.

I think YieldError is an interesting concept, but it requires too many changes to make it work. (I just wish I could be of more help here. :-/)

Cheers, Ron
Guido van Rossum wrote:
I like your example because it matches the way I would have used generators to solve it. OTOH, it is not hard to rewrite parallel_reduce as a traditional function. In fact, the result is a bit shorter and quite a bit faster so it is not a good example of what you need generators for.
I'm not sure I understand. Maybe you meant to rewrite it as a class? There's some state that wouldn't have a good place to live without either a class or a (generator) stackframe to survive.
How about

def parallel_reduce(items, funcs):
    items = iter(items)
    try:
        first = next(items)
    except StopIteration:
        raise TypeError
    accu = [first] * len(funcs)
    for b in items:
        accu = [f(a, b) for f, a in zip(funcs, accu)]
    return accu

Peter
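For comparison, the call used earlier in the thread gives the same result with this plain-function version:

print(parallel_reduce(range(100), [min, max]))   # -> [0, 99]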
On Tue, Oct 26, 2010 at 1:14 PM, Guido van Rossum
On Mon, Oct 25, 2010 at 6:35 PM, Jacob Holm
wrote: Throwing and catching GeneratorExit is not common, and according to some shouldn't be used for this purpose at all.
Well, *throwing* it is close()'s job. And *catching* it ought to be pretty rare. Maybe this idiom would be better:
def sum():
    total = 0
    try:
        while True:
            value = yield
            total += value
    finally:
        return total
Rereading my previous post that Jacob linked, I'm still a little uncomfortable with the idea of people deliberately catching GeneratorExit to turn it into a normal value return to be reported by close(). That said, I'm even less comfortable with the idea of encouraging the moral equivalent of a bare except clause :)

I see two realistic options here:

1. Use GeneratorExit for this, have g.close() return a value and I (and others that agree with me) just get the heck over it.

2. Add a new GeneratorReturn exception and a new g.finish() method that follows the same basic algorithm Guido suggested, only with a different exception type:

class GeneratorReturn(Exception):
    # Note: ordinary exception, unlike GeneratorExit
    pass

def finish(gen):
    try:
        gen.throw(GeneratorReturn)
        raise RuntimeError("Generator ignored GeneratorReturn")
    except StopIteration as err:
        if err.args:
            return err.args[0]
    except GeneratorReturn:
        pass
    return None

(Why "finish" as the suggested name for the method? I'd prefer "return", but that's a keyword and "return_" is somewhat ugly. Pairing GeneratorReturn with finish() is my second choice, for the "OK, time to wrap things up and complete your assigned task" connotations, as compared to the "drop everything and clean up the mess" connotations of GeneratorExit and close())

I'd personally be +1 on option 2 (since it addresses the immediate use case while maintaining appropriate separation of concerns between guaranteed resource cleanup and graceful completion of coroutines) and -0 on option 1 (unsurprising, given my previously stated objections to failing to maintain appropriate separation of concerns).

(I should note that this differs from the previous suggestion of a GeneratorReturn exception in the context of PEP 380. Those suggestions were to use it as a replacement for StopIteration when a generator contained a return statement. The suggestion here is to instead use it as a replacement for GeneratorExit in order to request prompt-but-graceful completion of a generator rather than just bailing out immediately).

Cheers, Nick.

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
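To make the intended protocol concrete, a small hypothetical usage sketch; it assumes the GeneratorReturn/finish() definitions above and PEP 380's return-with-value, and the averager() generator is made up for illustration.

def averager():
    total = 0.0
    count = 0
    try:
        while True:
            total += yield
            count += 1
    except GeneratorReturn:
        # "Wrap things up": report the result rather than just cleaning up.
        return total / count if count else None

g = averager()
next(g)                     # prime it
for x in (1, 2, 3, 4):
    g.send(x)
print(finish(g))            # -> 2.5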
On 2010-10-26 05:14, Guido van Rossum wrote:
On Mon, Oct 25, 2010 at 6:35 PM, Jacob Holm
wrote: On 2010-10-25 17:13, Guido van Rossum wrote:
Can you dig up the link here?
Well here is a recap of the end of the discussion about how to handle generator return values and g.close().
Thanks, very thorough!
I had to read through it myself to remember what actually happened, and thought you (and the rest of the world) might as well benefit from the notes I made.
The latest draft (#13) I have been able to find was announced in http://mail.python.org/pipermail/python-ideas/2009-April/004189.html
And can be found at http://mail.python.org/pipermail/python-ideas/attachments/20090419/c7d72ba8/...
Hmm... It does look like the PEP editors dropped the ball on this one (or maybe Greg didn't mail it directly to them). It doesn't seem there are substantial differences with the published version at http://www.python.org/dev/peps/pep-0380/ though, close() still doesn't return a value.
IIRC, there are a few minor semantic differences in how non-generators are handled. I haven't made a detailed comparison.
I had some later suggestions for how to change the expansion, see e.g. http://mail.python.org/pipermail/python-ideas/2009-April/004195.html (I find that version easier to reason about even now 1½ years later)
Hopefully you & Greg can agree on a new draft. I like this to make progress and I really want this to appear in 3.3. But I don't have the time to do the editing and reviewing of the PEP.
IIRC, this was just a presentation issue - the two expansions were supposed to be equivalent. It might become relevant if we want to change something in the definition, because we need a common base to discuss from. My version is (intended to be) simpler to reason about in the sense that things that should be handled the same are only written once.
What killed the proposal last time was the question of what should happen when you call g.close() on an exhausted generator. My preferred solution was (and is) that the generator should save the value from the terminating StopIteration (or None if it ended by some other means) and that g.close() should return that value each time and g.next(), g.send() and g.throw() should raise a StopIteration with the value. Unless you have changed your position on storing the return value, that solution is dead in the water.
I haven't changed my position. Closing a file twice doesn't do anything the second time either.
Ok
Here's a stupid idea... let g.close take an optional argument that it can return if the generator is already exhausted and let it return the value from the StopIteration otherwise.
def close(self, default=None):
    if self.gi_frame is None:
        return default
    try:
        self.throw(GeneratorExit)
    except StopIteration as e:
        return e.args[0]
    except GeneratorExit:
        return None
    else:
        raise RuntimeError('generator ignored GeneratorExit')
You'll have to explain why None isn't sufficient.
It is not really necessary, but seemed "cleaner" somehow. Think of "g.close(default)" as "get me the result if possible, and this default otherwise". Then think of dict.get()... An even cleaner solution might be Nick's "g.finish()" proposal, which I will comment on separately.
I think these things (at least priming and close()) are mostly an issue when using coroutines from non-coroutines. That means it is likely to be common in small examples where you write the whole program, but less common when you are writing small(ish) parts of a larger framework.
Throwing and catching GeneratorExit is not common, and according to some shouldn't be used for this purpose at all.
Well, *throwing* it is close()'s job. And *catching* it ought to be pretty rare. Maybe this idiom would be better:
def sum():
    total = 0
    try:
        while True:
            value = yield
            total += value
    finally:
        return total
This is essentially the same as a bare except. I think there is general agreement that that is a bad idea.
So, it is clear that generators are extremely versatile, and PEP 380 deserves several good use cases to explain all the API subtleties.
I like your example because it matches the way I would have used generators to solve it. OTOH, it is not hard to rewrite parallel_reduce as a traditional function. In fact, the result is a bit shorter and quite a bit faster so it is not a good example of what you need generators for.
I'm not sure I understand. Maybe you meant to rewrite it as a class? There's some state that wouldn't have a good place to live without either a class or a (generator) stackframe to survive.
See the reply by Peter Otten (and my reply to him). You mentioned some possible extensions though. At a guess, at least some of these would benefit greatly from the use of generators. Maybe such an extension would be a better example? - Jacob
On 2010-10-26 12:36, Nick Coghlan wrote:
On Tue, Oct 26, 2010 at 1:14 PM, Guido van Rossum
wrote: Well, *throwing* it is close()'s job. And *catching* it ought to be pretty rare. Maybe this idiom would be better:
def sum():
    total = 0
    try:
        while True:
            value = yield
            total += value
    finally:
        return total
Rereading my previous post that Jacob linked, I'm still a little uncomfortable with the idea of people deliberately catching GeneratorExit to turn it into a normal value return to be reported by close(). That said, I'm even less comfortable with the idea of encouraging the moral equivalent of a bare except clause :)
What Nick said. :)
I see two realistic options here:
1. Use GeneratorExit for this, have g.close() return a value and I (and others that agree with me) just get the heck over it.
This has the benefit of not needing an extra method/function and an extra exception for this style of programming. It still has the refactoring problem I mention below. That might be fixable in a similar way though. (Hmm thinking about this gives me a strong sense of deja-vu).
2. Add a new GeneratorReturn exception and a new g.finish() method that follows the same basic algorithm Guido suggested, only with a different exception type:
class GeneratorReturn(Exception):
    # Note: ordinary exception, unlike GeneratorExit
    pass
def finish(gen):
    try:
        gen.throw(GeneratorReturn)
        raise RuntimeError("Generator ignored GeneratorReturn")
    except StopIteration as err:
        if err.args:
            return err.args[0]
    except GeneratorReturn:
        pass
    return None
I like this. Having a separate function lets you explicitly request a return value, and making it fail loudly when called on an exhausted generator feels just right given the prohibition against saving the "true" return value anywhere. Also, using a different exception lets the generator distinguish between the "close" and "finish" cases, and making it an ordinary exception makes it clear that it is *intended* to be caught. All good stuff.

I am not sure that returning None when finish() catches GeneratorReturn is a good idea though. If you call finish on a generator you expect it to do something about it and return a value. If the GeneratorReturn escapes, it is a sign that the generator was not written to expect this, and so it is likely an error. OTOH, I am not sure it always is, so maybe allowing it is OK. I just don't know.

How does it fit with the current PEP 380, and esp. the refactoring principle? It seems like we need to special-case the GeneratorReturn exception somehow. Perhaps like this:

[...]
        try:
            _s = yield _y
+       except GeneratorReturn as _e:
+           try:
+               _m = _i.finish
+           except AttributeError:
+               raise _e   # XXX RuntimeError?
+           raise YieldFromFinished(_m())
        except GeneratorExit as _e:
[...]

Where YieldFromFinished inherits from GeneratorReturn, and has a 'value' attribute like the new StopIteration.

Without something like this a function that is written to work with "finish" is unlikely to be refactorable. With this, the trivial case of perfect delegation can be written as:

def outer():
    try:
        return yield from inner()
    except YieldFromFinished as e:
        return e.value

and a slightly more complex case...

def outer2():
    try:
        a = yield from innerA()
    except YieldFromFinished as e:
        return e.value
    try:
        b = yield from innerB()
    except YieldFromFinished as e:
        return a+e.value
    return a+b

The "outer2" example shows why the special-casing is needed. If outer2.finish() is called while outer2 is suspended in innerA, a GeneratorReturn would be thrown directly into innerA. Since innerA is supposed to be expecting this, it returns a value immediately, which would then be the return value of the yield-from. outer2 would then erroneously continue to the "b = yield from innerB()" line, which, unless innerB immediately raised StopIteration, would yield a value, causing the outer2.finish() to raise a RuntimeError...

We can avoid the extra YieldFromFinished exception if we let the new GeneratorReturn exception grow a value attribute instead and use it for both purposes. But then the distinction between a GeneratorReturn that is thrown in by "finish" (which has no associated value) and the GeneratorReturn raised by the yield-from (which has) gets blurred a bit.

Another idea is to actually replace YieldFromFinished with StopIteration or a GeneratorReturn inheriting from StopIteration. That would mean we could drop the first try-except block in each of the above example generators, because the "finished" result from the inner function is returned directly anyway. On the other hand, that could easily lead to subtle bugs if you forget a try...except block that is actually needed, like the second block in outer2.

A different way to handle this would be to change the PEP 380 expansion as follows:

[...]
-       except GeneratorExit as _e:
+       except (GeneratorReturn, GeneratorExit) as _e:
[...]

What this means is that only the outermost generator would see the GeneratorReturn. If the outermost generator is suspended using yield-from and finish() is called, the inner generator is simply closed and the GeneratorReturn re-raised.
This version is only really useful for delegating to generators that *don't* return a value, but it is simpler and at least it allows *some* use of yield-from with "finish".
(Why "finish" as the suggested name for the method? I'd prefer "return", but that's a keyword and "return_" is somewhat ugly. Pairing GeneratorReturn with finish() is my second choice, for the "OK, time to wrap things up and complete your assigned task" connotations, as compared to the "drop everything and clean up the mess" connotations of GeneratorExit and close())
I like the names. GeneratorFinish might work as well for the exception, but I like GeneratorReturn better for its connection with "return".
I'd personally be +1 on option 2 (since it addresses the immediate use case while maintaining appropriate separation of concerns between guaranteed resource cleanup and graceful completion of coroutines) and -0 on option 1 (unsurprising, given my previously stated objections to failing to maintain appropriate separation of concerns).
I agree the "finish" idea looks far better for generators without yield-from. It is unfortunate that extending it to work with yield-from isn't prettier that it is though.
(I should note that this differs from the previous suggestion of a GeneratorReturn exception in the context of PEP 380. Those suggestions were to use it as a replacement for StopIteration when a generator contained a return statement. The suggestion here is to instead use it as a replacement for GeneratorExit in order to request prompt-but-graceful completion of a generator rather than just bailing out immediately).
I agree the name fits this use better than the original. Too bad some of my suggestions above are starting to blur the line between GeneratorReturn and StopIteration again. - Jacob
On Tue, Oct 26, 2010 at 3:36 AM, Nick Coghlan
On Tue, Oct 26, 2010 at 1:14 PM, Guido van Rossum
wrote: On Mon, Oct 25, 2010 at 6:35 PM, Jacob Holm
wrote: Throwing and catching GeneratorExit is not common, and according to some shouldn't be used for this purpose at all.
Well, *throwing* it is close()'s job. And *catching* it ought to be pretty rare. Maybe this idiom would be better:
def sum():
    total = 0
    try:
        while True:
            value = yield
            total += value
    finally:
        return total
Rereading my previous post that Jacob linked, I'm still a little uncomfortable with the idea of people deliberately catching GeneratorExit to turn it into a normal value return to be reported by close(). That said, I'm even less comfortable with the idea of encouraging the moral equivalent of a bare except clause :)
My bad. I should have stopped at "except GeneratorExit: return total".
I see two realistic options here:
1. Use GeneratorExit for this, have g.close() return a value and I (and others that agree with me) just get the heck over it.
This is still my preferred option.
2. Add a new GeneratorReturn exception and a new g.finish() method that follows the same basic algorithm Guido suggested, only with a different exception type:
class GeneratorReturn(Exception):
    # Note: ordinary exception, unlike GeneratorExit
    pass
def finish(gen):
    try:
        gen.throw(GeneratorReturn)
        raise RuntimeError("Generator ignored GeneratorReturn")
    except StopIteration as err:
        if err.args:
            return err.args[0]
    except GeneratorReturn:
        pass
    return None
IMO there are already too many special exceptions and methods.
(Why "finish" as the suggested name for the method? I'd prefer "return", but that's a keyword and "return_" is somewhat ugly. Pairing GeneratorReturn with finish() is my second choice, for the "OK, time to wrap things up and complete your assigned task" connotations, as compared to the "drop everything and clean up the mess" connotations of GeneratorExit and close())
I'd personally be +1 on option 2 (since it addresses the immediate use case while maintaining appropriate separation of concerns between guaranteed resource cleanup and graceful completion of coroutines) and -0 on option 1 (unsurprising, given my previously stated objections to failing to maintain appropriate separation of concerns).
Hm, I guess I'm more in favor of minimal mechanism. The clincher for me is pretty much that the extended g.close() semantics are a very simple mod to the existing gen_close() function in genobject.c -- it currently always returns None but could very easily be changed to extract the return value from err.args when it catches StopIteration (but not GeneratorExit).

It also looks like my proposal doesn't get in the way of anything -- if the generator doesn't catch GeneratorExit, g.close() will return None, and if the caller of g.close() doesn't expect a value, they can just ignore it.

Finally note that this still looks like a relatively esoteric use case: when using "var = yield from generator()" the return value from the generator (written as "return X" and implemented as "raise StopIteration(X)") will automatically be delivered to var, and there's no need to call g.close(). In this case there is also no reason for the generator to catch GeneratorExit -- that is purely needed for the idiom of writing "inside-out iterators" using this pattern in the generator (as I mentioned on the parent thread):

try:
    while True:
        value = yield
        <use value>
except GeneratorExit:
    raise StopIteration(<result>)  # Or "return <result>" in PEP 380 syntax

Now, if I may temporarily go into wild-and-crazy mode (this *is* python-ideas after all :-), we could invent some ad-hoc syntax for this pattern, e.g.:

for value in yield:
    <use value>
return <result>

IOW the special form:

for <var> in yield:
    <body>

would translate into:

try:
    while True:
        <var> = yield
        <body>
except GeneratorExit:
    pass

If (and this is a big if) the while-True-yield-inside-try-except-GeneratorExit pattern somehow becomes popular we could reconsider this syntactic extension or some variant. (I have to add that the syntactic ice is a bit thin here, since "for <var> in (yield)" already has a meaning, and a totally different one of course. A variant could be "for <var> from yield" or some other abuse of keywords. But let me stop here before people think I've just volunteered my retirement... :-)
(I should note that this differs from the previous suggestion of a GeneratorReturn exception in the context of PEP 380. Those suggestions were to use it as a replacement for StopIteration when a generator contained a return statement. The suggestion here is to instead use it as a replacement for GeneratorExit in order to request prompt-but-graceful completion of a generator rather than just bailing out immediately).
Noted. -- --Guido van Rossum (python.org/~guido)
On Tue, Oct 26, 2010 at 5:22 AM, Jacob Holm
Here's a stupid idea... let g.close take an optional argument that it can return if the generator is already exhausted and let it return the value from the StopIteration otherwise.
    def close(self, default=None):
        if self.gi_frame is None:
            return default
        try:
            self.throw(GeneratorExit)
        except StopIteration as e:
            return e.args[0]
        except GeneratorExit:
            return None
        else:
            raise RuntimeError('generator ignored GeneratorExit')
You'll have to explain why None isn't sufficient.
It is not really necessary, but seemed "cleaner" somehow. Think of "g.close(default)" as "get me the result if possible, and this default otherwise". Then think of dict.get()...
Hm, I'd say there always is a result -- it just sometimes is None. I really don't want to make distinctions between falling off the end of the function, "return" without a value, "return None", "raise StopIteration()", "raise StopIteration(None)", or even (in response to a close() request) "raise GeneratorExit".
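A quick illustration of the point: none of these spellings carries a value, so a caller cannot tell them apart.

    def g1():
        if False:
            yield
        # falls off the end

    def g2():
        yield
        return            # bare return

    def g3():
        yield
        return None       # explicit None

    for gen in (g1(), g2(), g3()):
        try:
            while True:
                next(gen)
        except StopIteration as e:
            print(e.args)   # () in every case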
You mentioned some possible extensions though. At a guess, at least some of these would benefit greatly from the use of generators. Maybe such an extension would be a better example?
Yes, see the avg() example I posted in the parent thread. -- --Guido van Rossum (python.org/~guido)
On Tue, Oct 26, 2010 at 7:44 AM, Jacob Holm
I like this. Having a separate function lets you explicitly request a return value and making it fail loudly when called on an exhausted generator feels just right given the prohibition against saving the "True" return value anywhere. Also, using a different exception lets the generator distinguish between the "close" and "finish" cases, and making it an ordinary exception makes it clear that it is *intended* to be caught. All good stuff.
I don't know. There are places where failing loudly is the right thing to do (1 + 'a'). But when it comes to return values Python takes a pretty strong position that there's no difference between functions and procedures, that "return", "return None" and falling off the end all mean the same thing, and that it's totally fine to ignore a value or to return a value that will be of no interest for most callers.
I am not sure that returning None when finish() catches GeneratorReturn is a good idea though. If you call finish on a generator you expect it to do something about it and return a value. If the GeneratorReturn escapes, it is a sign that the generator was not written to expect this, and so it is likely an error. OTOH, I am not sure it always is, so maybe allowing it is OK. I just don't know.
How does it fit with the current PEP 380, and esp. the refactoring principle? It seems like we need to special-case the GeneratorReturn exception somehow. Perhaps like this:
    [...]
              try:
                  _s = yield _y
    +         except GeneratorReturn as _e:
    +             try:
    +                 _m = _i.finish
    +             except AttributeError:
    +                 raise _e  # XXX RuntimeError?
    +             raise YieldFromFinished(_m())
              except GeneratorExit as _e:
    [...]
Where YieldFromFinished inherits from GeneratorReturn, and has a 'value' attribute like the new StopIteration.
Without something like this a function that is written to work with "finish" is unlikely to be refactorable. With this, the trivial case of perfect delegation can be written as:
    def outer():
        try:
            return yield from inner()
        except YieldFromFinished as e:
            return e.value
and a slightly more complex case...
    def outer2():
        try:
            a = yield from innerA()
        except YieldFromFinished as e:
            return e.value
        try:
            b = yield from innerB()
        except YieldFromFinished as e:
            return a + e.value
        return a + b
the "outer2" example shows why the special casing is needed. If outer2.finish() is called while outer2 is suspended in innerA, a GeneratorReturn would be thrown directly into innerA. Since innerA is supposed to be expecting this, it returns a value immediately which would then be the return value of the yield-from. outer2 would then erroneously continue to the "b = yield from innerB()" line, which unless innerB immediately raised StopIteration would yield a value causing the outer2.finish() to raise a RuntimeError...
We can avoid the extra YieldFromFinished exception if we let the new GeneratorReturn exception grow a value attribute instead and use it for both purposes. But then the distinction between a GeneratorReturn that is thrown in by "finish" (which has no associated value) and the GeneratorReturn raised by the yield-from (which has) gets blurred a bit.
Another idea is to actually replace YieldFromFinished with StopIteration or a GeneratorReturn inheriting from StopIteration. That would mean we could drop the first try-except block in each of the above example generators because the "finished" result from the inner function is returned directly anyway. On the other hand, that could easily lead to subtle bugs if you forget a try...except block that is actually needed, like the second block in outer2.
I'm afraid that all was too much to really reach my brain, which keeps telling me "he's commenting on Nick's proposal which I've already rejected".
A different way to handle this would be to change the PEP 380 expansion as follows:
    [...]
    -         except GeneratorExit as _e:
    +         except (GeneratorReturn, GeneratorExit) as _e:
    [...]
That just strikes me as one more reason why a separate GeneratorReturn is a bad idea. In my ideal world, you almost never need to catch or raise StopIteration; you don't raise GeneratorExit (that is close()'s job) but you catch it to notice that your data source is finished, and then you return a value. (And see my crazy idea in my previous post to get rid of that too. :-)
What this means is that only the outermost generator would see the GeneratorReturn. If the outermost generator is suspended using yield-from and finish() is called, the inner generator is simply closed and the GeneratorReturn re-raised. This version is only really useful for delegating to generators that *don't* return a value, but it is simpler and at least it allows *some* use of yield-from with "finish".
(Why "finish" as the suggested name for the method? I'd prefer "return", but that's a keyword and "return_" is somewhat ugly. Pairing GeneratorReturn with finish() is my second choice, for the "OK, time to wrap things up and complete your assigned task" connotations, as compared to the "drop everything and clean up the mess" connotations of GeneratorExit and close())
I like the names. GeneratorFinish might work as well for the exception, but I like GeneratorReturn better for its connection with "return".
I'd personally be +1 on option 2 (since it addresses the immediate use case while maintaining appropriate separation of concerns between guaranteed resource cleanup and graceful completion of coroutines) and -0 on option 1 (unsurprising, given my previously stated objections to failing to maintain appropriate separation of concerns).
I agree the "finish" idea looks far better for generators without yield-from. It is unfortunate that extending it to work with yield-from isn't prettier that it is though.
(I should note that this differs from the previous suggestion of a GeneratorReturn exception in the context of PEP 380. Those suggestions were to use it as a replacement for StopIteration when a generator contained a return statement. The suggestion here is to instead use it as a replacement for GeneratorExit in order to request prompt-but-graceful completion of a generator rather than just bailing out immediately).
I agree the name fits this use better than the original. Too bad some of my suggestions above are starting to blur the line between GeneratorReturn and StopIteration again.
So now I'm even more convinced that it's not worth it... -- --Guido van Rossum (python.org/~guido)
On Wed, Oct 27, 2010 at 3:33 AM, Guido van Rossum
On Tue, Oct 26, 2010 at 7:44 AM, Jacob Holm
wrote: A different way to handle this would be to change the PEP 380 expansion as follows:
    [...]
    -         except GeneratorExit as _e:
    +         except (GeneratorReturn, GeneratorExit) as _e:
    [...]
That just strikes me as one more reason why a separate GeneratorReturn is a bad idea.
In my ideal world, you almost never need to catch or raise StopIteration; you don't raise GeneratorExit (that is close()'s job) but you catch it to notice that your data source is finished, and then you return a value. (And see my crazy idea in my previous post to get rid of that too. :-)
Jacob's "implications for PEP 380" exploration started to give me some doubts, but I think there are actually some flaws in his argument. Accordingly, I would like to make one more attempt at explaining why I think throwing in a separate exception for this use case is valuable (and *doesn't* require any changes to PEP 380). As I see it, there's a bit of a disconnect between many PEP 380 use cases and any mechanism or idiom which translates a thrown in exception into an ordinary StopIteration. If you expect your thrown in exception to always terminate the generator in some fashion, adopting the latter idiom in your generator will make it potentially unsafe to use in a "yield from" expression that isn't the very last yield operation in any outer generator. Consider the following: def example(arg): try: yield arg except GeneratorExit return "Closed" return "Finished" def outer_ok1(arg): # close() after next() returns "Closed" return yield from example(arg) def outer_ok2(arg): # close() after next() returns None yield from example(arg) def outer_broken(arg): # close() after next() gives RuntimeError val = yield from example(arg) yield val # All 3 cases: close() before next() returns None # All 3 cases: close() after 2x next() returns None Using close() to say "give me your return value" creates the risk of hitting those runtime errors in a generator's __del__ method, and exceptions in __del__ are always a bit ugly. Keeping the "give me your return value" and "clean up your resources" concerns separate by adding a new method and thrown exception means that close() is less likely to unpredictably raise RuntimeError (and when it does, will reliably indicate a genuine bug in a generator somewhere that is suppressing GeneratorExit). As far as PEP 380's semantics go, I think it should ignore the existence of anything like GeneratorReturn completely. Either one of the generators in the chain will catch the exception and turn it into StopIteration, or they won't. If they convert it to StopIteration, and they aren't the last generator in the chain, then maybe what actually needs to happen at the outermost level is something like this: class GeneratorReturn(Exception): pass def finish(gen): try: gen.throw(GeneratorReturn) # Ask generator to wrap things up except StopIteration as err: if err.args: return err.args[0] except GeneratorReturn: pass else: # Asking nicely didn't work, so force resource cleanup # and treat the result as if the generator had already # been exhausted or hadn't started yet gen.close() return None Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On 2010-10-26 19:01, Guido van Rossum wrote:
On Tue, Oct 26, 2010 at 5:22 AM, Jacob Holm
wrote: [...] Here's a stupid idea... let g.close take an optional argument that it can return if the generator is already exhausted and let it return the value from the StopIteration otherwise.
    def close(self, default=None):
        if self.gi_frame is None:
            return default
        try:
            self.throw(GeneratorExit)
        except StopIteration as e:
            return e.args[0]
        except GeneratorExit:
            return None
        else:
            raise RuntimeError('generator ignored GeneratorExit')
You'll have to explain why None isn't sufficient.
It is not really necessary, but seemed "cleaner" somehow. Think of "g.close(default)" as "get me the result if possible, and this default otherwise". Then think of dict.get()...
Hm, I'd say there always is a result -- it just sometimes is None. I really don't want to make distinctions between falling off the end of the function, "return" without a value, "return None", "raise StopIteration()", "raise StopIteration(None)", or even (in response to a close() request) "raise GeneratorExit".
None of these cover the distinction I am making. I want to distinguish between a non-exhausted and an exhausted generator.

When calling close on a non-exhausted generator, the generator decides how to return by any one of the means you mentioned. In this case you are right that there is always a result.

When calling close on an exhausted generator, the generator has no choice in the matter as the "true" return value was thrown away. We have to return *something*, but calling it the "result" of the generator is stretching it too far. Making it possible to return something other than None in this case seems to be analogous to dict.get().

If we chose to use a different method (e.g. Nick's "finish") for getting the "result", I would instead raise a RuntimeError when calling it on an exhausted generator. I.o.w., I would want it defined something like this:

    def finish(self):
        if self.gi_frame is None:
            raise RuntimeError('generator already finished')
        try:
            self.throw(GeneratorExit)
        except StopIteration as e:
            return e.args[0]
        except GeneratorExit:
            return None  # XXX debatable but unimportant to me
        else:
            raise RuntimeError('generator ignored GeneratorExit')

(possibly using a new GeneratorReturn exception instead)

You might argue for using a different exception for signaling the exhausted case, e.g.:

    class GeneratorFinishedError(StandardError):
        """finish() called on exhausted generator."""

but that only really makes sense if you think calling finish without knowing whether the generator is exhausted is a reasonable thing to do. *If* that is the case, we should also consider adding a 'default' argument to finish which (if provided) could be returned instead of raising the exception (kind of like dict.pop).

- Jacob
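A rough sketch of the dict.pop-style 'default' variant mentioned above, written as a standalone function since generators cannot grow new methods from user code (the _MISSING sentinel is just an implementation detail of the sketch):

    _MISSING = object()

    def finish(gen, default=_MISSING):
        if gen.gi_frame is None:                 # exhausted or closed
            if default is _MISSING:
                raise RuntimeError('generator already finished')
            return default
        try:
            gen.throw(GeneratorExit)
        except StopIteration as e:
            return e.args[0] if e.args else None
        except GeneratorExit:
            return None
        else:
            raise RuntimeError('generator ignored GeneratorExit')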
On 10/25/2010 10:25 PM, Guido van Rossum wrote:
By the way, here's how to emulate the value-returning-close() on a generator, assuming the generator uses raise StopIteration(x) to mean return x:
    def gclose(gen):
        try:
            gen.throw(GeneratorExit)
        except StopIteration as err:
            if err.args:
                return err.args[0]
        except GeneratorExit:
            pass
        return None
I like this because it's fairly straightforward (except for the detail of having to also catch GeneratorExit).
In fact it would be a really simple change to gen_close() in genobject.c -- the only change needed there would be to return err.args[0]. I like small evolutionary improvements to APIs.
Here's an interesting idea...

It looks like a common case for consumer co-functions is that they need to be started and then closed, so I'm wondering if we can make these work as context managers? That may be a way to reduce the need for the try/except blocks inside the generators.

    with my_cofunction(args) as c:
        ... use c

Regards,
Ron
On 10/27/2010 10:01 AM, Ron Adam wrote:
On 10/25/2010 10:25 PM, Guido van Rossum wrote:
By the way, here's how to emulate the value-returning-close() on a generator, assuming the generator uses raise StopIteration(x) to mean return x:
    def gclose(gen):
        try:
            gen.throw(GeneratorExit)
        except StopIteration as err:
            if err.args:
                return err.args[0]
        except GeneratorExit:
            pass
        return None
I like this because it's fairly straightforward (except for the detail of having to also catch GeneratorExit).
In fact it would be a really simple change to gen_close() in genobject.c -- the only change needed there would be to return err.args[0]. I like small evolutionary improvements to APIs.
Here's an interesting idea...
It looks like a common case for consumer co-functions is they need to be started and then closed, so I'm wondering if we can make these work context managers? That may be a way to reduce the need for the try/except blocks inside the generators.
It looks like no context managers return values from the finally or __exit__ part of a context manager. Is there a way to do that?

Here's a context manager version of the min/max with nested coroutines, but it doesn't return a value from close.

    ######
    from contextlib import contextmanager

    # New close function that enables returning a
    # value.

    def gclose(gen):
        try:
            gen.throw(GeneratorExit)
        except StopIteration as err:
            if err.args:
                return err.args[0]
        except GeneratorExit:
            pass
        return None

    # Showing both the class and generator based
    # context managers for comparison and to better
    # see how these things may work.

    class Consumer:
        def __init__(self, cofunc):
            next(cofunc)
            self.cofunc = cofunc
        def __enter__(self):
            return self.cofunc
        def __exit__(self, *exc_info):
            gclose(self.cofunc)

    @contextmanager
    def consumer(cofunc):
        next(cofunc)
        try:
            yield cofunc
        finally:
            gclose(cofunc)

    class MultiConsumer:
        def __init__(self, cofuncs):
            for c in cofuncs:
                next(c)
            self.cofuncs = cofuncs
        def __enter__(self):
            return self.cofuncs
        def __exit__(self, *exc_info):
            for c in self.cofuncs:
                gclose(c)

    @contextmanager
    def multiconsumer(cofuncs):
        for c in cofuncs:
            next(c)
        try:
            yield cofuncs
        finally:
            for c in cofuncs:
                gclose(c)

    # Min/max coroutine example split into
    # nested coroutines for testing these ideas
    # in a more complex situation that may arise
    # when working with cofunctions and generators.

    # Question:
    # How to rewrite this so close returns
    # a final value?

    def reduce_i(f):
        i = yield
        while True:
            i = f(i, (yield i))

    def reduce_it_to(funcs):
        with multiconsumer([reduce_i(f) for f in funcs]) as mc:
            values = None
            while True:
                i = yield values
                values = [c.send(i) for c in mc]

    def main():
        with consumer(reduce_it_to([min, max])) as c:
            for i in range(100):
                value = c.send(i)
            print(value)

    if __name__ == '__main__':
        main()
On 2010-10-26 18:56, Guido van Rossum wrote:
Now, if I may temporarily go into wild-and-crazy mode (this *is* python-ideas after all :-), we could invent some ad-hoc syntax for this pattern, e.g.:
    for value in yield:
        <use value>
    return <result>
IOW the special form:
    for <var> in yield:
        <body>
would translate into:
    try:
        while True:
            <var> = yield
            <body>
    except GeneratorExit:
        pass
If (and this is a big if) the while-True-yield-inside-try-except-GeneratorExit pattern somehow becomes popular we could reconsider this syntactic extension or some variant. (I have to add that the syntactic ice is a bit thin here, since "for <var> in (yield)" already has a meaning, and a totally different one of course. A variant could be "for <var> from yield" or some other abuse of keywords.
Hmm. This got me thinking. One thing I'd really like to see in Python is something like the "channel" object from the go language (http://golang.org/).

Based on PEP 380 or Greg's new cofunctions PEP (or perhaps even without any of them) it is possible to write a trampoline-based implementation of a channel object with "send" and "next" methods that work as expected. One thing that is *not* possible (I think) is to make that object iterable. Your wild idea above gave me a similar wild idea of my own. An extension to the cofunctions PEP that would make that possible.

1) Define a new "coiterator" protocol, consisting of a new special method __conext__, and a new StopCoIteration exception that the regular StopIteration inherits from. __conext__ should be a generator that yields as many times as necessary, then either raises StopCoIteration or returns a result (possibly by raising a StopIteration with a value). Add a new built-in "conext" cofunction that looks for a __conext__ method instead of a __next__ method.

2) Define a new "coiterable" protocol, consisting of a new special method __coiter__. __coiter__ is a regular function and should return an object implementing the "coiterator" protocol. Add a new built-in "coiter" function that looks for a __coiter__ method instead of an __iter__ method. (We could also make this a cofunction but for now I don't see the point).

3) Make sure that the for-loop in a cofunction:

    for val in coiterable:
        <block>
    else:
        <block>

expands as:

    _it = coiter(coiterable)
    while True:
        try:
            val = cocall conext(_it)
        except StopCoIteration:
            break
        <block>
    else:
        <block>

Which is exactly the same as in a normal function, except for the use of "coiter" and "cocall conext" instead of "iter" and "next", and the use of StopCoIteration instead of StopIteration.

3a) Alternatively define a new syntax for "coiterating" that expands as in 3 and whose use is an alternative indicator that this is a cofunction.

All this to make it possible to write code like this:

    def consumer(ch):
        for val in ch:
            cocall print(val)  # XXX need a cocall somewhere

    def producer(ch):
        for val in range(10):
            cocall ch.send(val)

    def main():
        sched = scheduler()
        ch = channel()
        sched.add(consumer(ch))
        sched.add(producer(ch))
        sched.run()

Thoughts?

- Jacob
On Wed, Oct 27, 2010 at 9:18 AM, Ron Adam
On 10/27/2010 10:01 AM, Ron Adam wrote: It looks like No context managers return values in the finally or __exit__ part of a context manager. Is there way to do that?
How would that value be communicated to the code containing the with-clause?
Here's a context manager version of the min/max with nested coroutines, but it doesn't return a value from close.
    ######
    from contextlib import contextmanager

    # New close function that enables returning a
    # value.

    def gclose(gen):
        try:
            gen.throw(GeneratorExit)
        except StopIteration as err:
            if err.args:
                return err.args[0]
        except GeneratorExit:
            pass
        return None

    # Showing both the class and generator based
    # context managers for comparison and to better
    # see how these things may work.

    class Consumer:
        def __init__(self, cofunc):
            next(cofunc)
            self.cofunc = cofunc
        def __enter__(self):
            return self.cofunc
        def __exit__(self, *exc_info):
            gclose(self.cofunc)

    @contextmanager
    def consumer(cofunc):
        next(cofunc)
        try:
            yield cofunc
        finally:
            gclose(cofunc)

    class MultiConsumer:
        def __init__(self, cofuncs):
            for c in cofuncs:
                next(c)
            self.cofuncs = cofuncs
        def __enter__(self):
            return self.cofuncs
        def __exit__(self, *exc_info):
            for c in self.cofuncs:
                gclose(c)

    @contextmanager
    def multiconsumer(cofuncs):
        for c in cofuncs:
            next(c)
        try:
            yield cofuncs
        finally:
            for c in cofuncs:
                gclose(c)
So far so good.
# Min/max coroutine example slpit into # nested coroutines for testing these ideas # in a more complex situation that may arise # when working with cofunctions and generators.
# Question: # How to rewrite this so close returns # a final value?
Change the function to catch GeneratorExit and when it catches that, raise StopIteration(<returnvalue>).
    def reduce_i(f):
        i = yield
        while True:
            i = f(i, (yield i))
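A minimal sketch of that suggestion applied to the reduce_i just quoted -- it assumes the gclose() helper posted earlier in the thread, and uses PEP 380's "return <value>" spelling rather than raising StopIteration directly:

    def reduce_i(f):
        try:
            i = yield
            while True:
                i = f(i, (yield i))
        except GeneratorExit:
            return i          # becomes StopIteration(i), which gclose() unpacks

    r = reduce_i(max)
    next(r)
    for x in (3, 1, 4, 1, 5):
        r.send(x)
    print(gclose(r))          # 5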
Unfortunately from here on till the end of your example my brain exploded.
    def reduce_it_to(funcs):
        with multiconsumer([reduce_i(f) for f in funcs]) as mc:
            values = None
            while True:
                i = yield values
                values = [c.send(i) for c in mc]
Maybe you could have picked a better name than 'i' for this variable...
    def main():
        with consumer(reduce_it_to([min, max])) as c:
            for i in range(100):
                value = c.send(i)
            print(value)
I sort of get what you are doing here but I think you left one abstraction out. Something like this:

    def blah(it, funcs):
        with consumer(reduce_it_to(funcs)) as c:
            for i in it:
                value = c.send(i)
            return value

    def main():
        print(blah(range(100), [min, max]))
if __name__ == '__main__': main()
-- --Guido van Rossum (python.org/~guido)
On 2010-10-27 00:14, Nick Coghlan wrote:
Jacob's "implications for PEP 380" exploration started to give me some doubts, but I think there are actually some flaws in his argument.
I'm not sure I made much of an argument. I showed an example that assumed the change I was suggesting and explained what the problem would be without the change. Let me try another example:

    def filesum(fn):
        s = 0
        with open(fn) as fd:
            for line in fd:
                s += int(line)
                yield  # be cooperative..
        return s

    def multifilesum():
        a = yield from filesum('fileA')
        b = yield from filesum('fileB')
        return a + b

    def main():
        g = multifilesum()
        for i in range(10):
            try:
                next(g)
            except StopIteration as e:
                r = e.value
                break
        else:
            r = g.finish()

This tries to read at most 10 lines from 'fileA' + 'fileB', interpreting each line as an integer and returning their combined sum. It works fine if there are at most 10 lines but is broken if 'fileA' has more than 10 lines. What's more, assuming latest PEP 380 + your "finish" and no other changes I don't see a simple way of fixing it. With my modification of your "finish" proposal you can add a few try...except blocks to the code and it will "just work (tm)"...
Accordingly, I would like to make one more attempt at explaining why I think throwing in a separate exception for this use case is valuable (and *doesn't* require any changes to PEP 380).
I am convinced that it does, at least if you want it to be useable with yield-from. But the same goes for any version that uses GeneratorExit.
As I see it, there's a bit of a disconnect between many PEP 380 use cases and any mechanism or idiom which translates a thrown in exception into an ordinary StopIteration. If you expect your thrown in exception to always terminate the generator in some fashion, adopting the latter idiom in your generator will make it potentially unsafe to use in a "yield from" expression that isn't the very last yield operation in any outer generator.
Right. This is the problem I'm trying to address by modifying the PEP expansion.
Consider the following:
    def example(arg):
        try:
            yield arg
        except GeneratorExit:
            return "Closed"
        return "Finished"

    def outer_ok1(arg):    # close() after next() returns "Closed"
        return yield from example(arg)

    def outer_ok2(arg):    # close() after next() returns None
        yield from example(arg)

    def outer_broken(arg): # close() after next() gives RuntimeError
        val = yield from example(arg)
        yield val

    # All 3 cases: close() before next() returns None
    # All 3 cases: close() after 2x next() returns None
Actually, AFAICT outer_broken will *not* give a RuntimeError on close() after next(). This is due to the special-casing of GeneratorExit in PEP 380. That special-casing is also the basis for both my suggested modifications. In fact, in all 3 cases close() after next() would give None because the "inner" return value is discarded and the GeneratorExit reraised. Only when called directly would the inner "example" function return "Closed" on close() after next().
Using close() to say "give me your return value" creates the risk of hitting those runtime errors in a generator's __del__ method,
Not really. Returning a value from close with no other changes does not change the risk of that happening. Of course I *do* think other changes are necessary, but then we'll need to look at those before concluding they are a problem...
and exceptions in __del__ are always a bit ugly.
That they are.
Keeping the "give me your return value" and "clean up your resources" concerns separate by adding a new method and thrown exception means that close() is less likely to unpredictably raise RuntimeError (and when it does, will reliably indicate a genuine bug in a generator somewhere that is suppressing GeneratorExit).
As far as PEP 380's semantics go, I think it should ignore the existence of anything like GeneratorReturn completely. Either one of the generators in the chain will catch the exception and turn it into StopIteration, or they won't. If they convert it to StopIteration, and they aren't the last generator in the chain, then maybe what actually needs to happen at the outermost level is something like this:
class GeneratorReturn(Exception): pass
    def finish(gen):
        try:
            gen.throw(GeneratorReturn)  # Ask generator to wrap things up
        except StopIteration as err:
            if err.args:
                return err.args[0]
        except GeneratorReturn:
            pass
        else:
            # Asking nicely didn't work, so force resource cleanup
            # and treat the result as if the generator had already
            # been exhausted or hadn't started yet
            gen.close()
        return None
This, I don't like. If we have a distinct method for "finishing" a generator and getting a return value, I want it to tell me if the return value was arrived at in some other way. Preferably with an exception, as in:

    def finish(self):
        if self.gi_frame is None:
            raise RuntimeError('finish() on exhausted/closed generator')
        try:
            self.throw(GeneratorReturn)
        except StopIteration as err:
            if err.args:
                return err.args[0]
        except GeneratorReturn:
            pass
        else:
            raise RuntimeError('generator ignored GeneratorReturn')
        return None

The point of "finish" as I see it is not the "closing" part, but the "give me a result" part.

Anyway, I am (probably) not going to argue much further for this. The only new thing that is on the table here is the "finish" function, and using a new exception. The use of a new exception solves some of the issues that you and Greg had earlier, but leaves the problem of using a value-returning close/finish with yield-from. (And Guido doesn't like it). Since no one seems interested in even considering a change to the PEP 380 expansion to fix this, I don't really see anything more I can contribute at this point.

- Jacob
On Thu, Oct 28, 2010 at 2:18 AM, Ron Adam
It looks like No context managers return values in the finally or __exit__ part of a context manager. Is there way to do that?
The return value from __exit__ is used to decide whether or not to suppress the exception (i.e. bool(__exit__()) == True will suppress the exception that was passed in). There are a few CMs in the test suite (test.support) that provide info about things that happened during their with statement - they all use the trick of returning a stateful object from __enter__, then modifying the attributes of that object in __exit__. I seem to recall the CM variants of unittest.TestCase.assertRaises* doing the same thing (so you can poke and prod at the raised exception yourself). warnings.catch_warnings also appends encountered warnings to a list returned by __enter__ when record=True. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
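A minimal sketch of that trick (all names invented; the 42 stands in for whatever the __exit__ method computes):

    class collect_result:
        class _Result:
            value = None

        def __enter__(self):
            # Hand out a mutable result object; it is still empty here.
            self.result = self._Result()
            return self.result

        def __exit__(self, *exc_info):
            # Fill it in on the way out of the with block.
            self.result.value = 42
            return False          # don't suppress exceptions

    with collect_result() as res:
        pass                      # do the actual work here
    print(res.value)              # 42, filled in by __exit__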
On Thu, Oct 28, 2010 at 6:22 AM, Jacob Holm
Actually, AFAICT outer_broken will *not* give a RuntimeError on close() after next(). This is due to the special-casing of GeneratorExit in PEP 380. That special-casing is also the basis for both my suggested modifications.
Ah, you're quite right - I'd completely forgotten about the GeneratorExit special-casing in the PEP 380 semantics, so I was arguing from a faulty premise. With that error corrected, I can happily withdraw my objection to idioms that convert GeneratorExit to StopIteration (since any yield from expressions will reraise the GeneratorExit in that case).

The "did-it-really-finish?" question can likely be answered by slightly improving generator state introspection from the Python level (as I believe Guido suggested earlier in the thread). That way close() can keep the gist of its current semantics (return something if the generator ends up in an inactive state, raise RuntimeError if it yields another value), while frameworks can object to other unexpected states if they want to.

As it turns out, the information on generator state is already there, just not in a particularly user friendly format ("not started" = "g.gi_frame is not None and g.gi_frame.f_lasti == -1", "terminated" = "g.gi_frame is None").

So, without any modifications at all to the current incarnation of PEP 380, it is already possible to write:

    def finish(gen):
        frame = gen.gi_frame
        if frame is None:
            raise RuntimeError('finish() on exhausted/closed generator')
        if frame.f_lasti == -1:
            raise RuntimeError('finish() on not yet started generator')
        try:
            gen.throw(GeneratorExit)
        except StopIteration as err:
            if err.args:
                return err.args[0]
            return None
        except GeneratorExit:
            pass
        else:
            raise RuntimeError('Generator ignored GeneratorExit')
        raise RuntimeError('Generator failed to return a value')

I think I'm finally starting to understand *your* question/concern though. Given the current PEP 380 expansion, the above definition of finish() and the following two generators:

    def g_inner():
        yield
        return "Hello world!"

    def g_outer():
        yield (yield from g_inner())

You would get the following result (as g_inner converts GeneratorExit to StopIteration, then yield from propagates that up the stack):
    >>> g = g_outer()
    >>> next(g)
    >>> finish(g)
    "Hello world!"
Oops?

I'm wondering if this part of the PEP 380 expansion:

    if _e is _x[1] or isinstance(_x[1], GeneratorExit):
        raise

Should actually look like:

    if _e is _x[1]:
        raise
    if isinstance(_x[1], GeneratorExit):
        raise GeneratorExit(*_e.args)

Once that distinction is made, you can more easily write helper functions and context managers that allow code to do the "right thing" according to the needs of a particular framework or application.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Thu, Oct 28, 2010 at 8:52 AM, Nick Coghlan
On Thu, Oct 28, 2010 at 6:22 AM, Jacob Holm
wrote: Actually, AFAICT outer_broken will *not* give a RuntimeError on close() after next(). This is due to the special-casing of GeneratorExit in PEP 380. That special-casing is also the basis for both my suggested modifications.
Ah, you're quite right - I'd completely forgotten about the GeneratorExit special-casing in the PEP 380 semantics, so I was arguing from a faulty premise. With that error corrected, I can happily withdraw my objection to idioms that convert GeneratorExit to StopIteration (since any yield from expressions will reraise the GeneratorExit in that case).
Correction: they'll reraise StopIteration with the current PEP semantics, GeneratorExit with the proposed modification at the end of my last message. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
Nick & Jacob,

Unfortunately other things are in need of my attention and I am quickly lagging behind on this thread. I'll try to respond to some issues without specific quoting.

If GeneratorReturn and finish() can be implemented in pure user code, then I think it should be up to every (framework) developer to provide their own API, using whatever constraints they chose. Without specific use cases it's hard to reason about API design. Still, I think it is reasonable to offer some basic behavior on the generator object, and I still think that the best compromise here is to let g.close() extract the return value from StopIteration if it catches it. If a framework decides not to use this, fine. For a user working without a framework this is still just a little nicer than having to figure out the required logic yourself.

I am aware of four relevant states for generators. Here's how they work (in current Python):

- initial state: execution is poised at the top of the function. g.throw() always bounces back the exception. g.close() moves it to the final state. g.next() starts it running. g.send() requires a None argument and is then the same as g.next().

- running state: the frame is active. None of g.next(), g.send(), g.throw() or g.close() work -- they all raise ValueError.

- suspended state: execution is suspended at a yield. g.close() raises GeneratorExit and if the generator catches this it can do whatever it pleases. If it then raises StopIteration or GeneratorExit, g.close() is happy; if it raises another exception g.close() just passes that through; if it yields a value g.close() complains and raises RuntimeError().

- finished (exhausted) state: the generator has returned. g.close() always returns None. g.throw() always bounces back the exception. g.next() and g.send() always raise StopIteration.

I would be in favor of adding an introspection API to distinguish these four states and I think it would be a fine thing to add to Python 3.2 if anyone finds the time to produce a patch (Nick? You showed what these boil down to.)

I note that in the initial state a generator has no choice in how to respond because it hasn't yet had the opportunity to set up a try/except, so in this state it acts pretty much the same as in the exhausted state when receiving a throw() or close().

Regarding built-in syntax for Go-like channels, let's first see an implementation in userland become successful *or* see that it's impossible to write an efficient one before adding more to the language.

Note that having a different expansion of a for-loop based on the run-time value or type of the iterable cannot be done -- the expansion can only vary based on the syntactic form.

There are a few different conventions for using generators and yield-from; e.g. generators used as proper iterators with easy refactoring; generators used as tasks where yield X is used for blocking I/O operations; and generators used as "inverse generators" as in the parallel_reduce() example that initiated this thread. I don't particularly care about what kind of errors you get if a generator written for one convention is accidentally used by another convention, as long as it is made clear which convention is being used in each case. Frameworks/libraries can and probably should develop decorators to mark up the 2nd and 3rd conventions, but I don't think the *language* needs to go out of its way to enforce proper usage.

--
--Guido van Rossum (python.org/~guido)
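To make the four states concrete, here is how they show up through inspect.getgeneratorstate(), the stdlib helper for exactly this kind of introspection (the sample generator is just an illustration):

    import inspect

    def sample():
        # While the body is executing, the generator reports itself as running.
        print(inspect.getgeneratorstate(g))   # GEN_RUNNING
        yield 1

    g = sample()
    print(inspect.getgeneratorstate(g))       # GEN_CREATED   (initial)
    next(g)                                    # prints GEN_RUNNING from inside
    print(inspect.getgeneratorstate(g))       # GEN_SUSPENDED
    g.close()
    print(inspect.getgeneratorstate(g))       # GEN_CLOSED    (finished)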
On 10/27/2010 01:38 PM, Guido van Rossum wrote:
On Wed, Oct 27, 2010 at 9:18 AM, Ron Adam
wrote: On 10/27/2010 10:01 AM, Ron Adam wrote: It looks like No context managers return values in the finally or __exit__ part of a context manager. Is there way to do that?
How would that value be communicated to the code containing the with-clause?
I think that was what I was trying to figure out also.
    def reduce_i(f):
        i = yield
        while True:
            i = f(i, (yield i))
Unfortunately from here on till the end of your example my brain exploded.
Mine did too, but I think it was a useful but strange experience. ;-)

It forced me to take a break and think about the problem from a different viewpoint. Here's the conclusion I came to, but be forewarned, it's kind of anti-climactic. :-)

The use of an exception to signal some bit of code is a way to reach over a wall that also protects that bit of code. This seems to be a more common need when using coroutines, because it's more common to have some bits of code indirectly direct some other bit of code.

Generators already have a nice .throw() method that will return the value at the next yield. But we either have to choose an existing exception to throw, that has some other purpose, or make up a new one. When it comes to making up new ones, lots of other programmers may each call it something else. That isn't a big problem, but it may be nice if we had a standard exception for saying.. "Hey you!, send me a total or subtotal!". And that's all that it does. For now let's call it a ValueRequest exception.

ValueRequest makes sense if you are throwing an exception; I think ValueReturn may make more sense if you are raising an exception. Or maybe there is something that reads well both ways? These both fit very nicely with ValueError and it may make reading code easier if we make a distinction between a request and a return.

Below is the previous example rewritten to do this. A ValueRequest doesn't stop anything or force anything to close, so it won't ever interfere, confuse, or complicate code that uses other exceptions. You can always throw or catch one of these and raise something else if you need to.

Since throwing it into a generator doesn't stop the generator, the generator can put the try-except into a larger loop and loop back to get more values and catch another ValueRequest at some later point. I feel that is a useful and handy thing to do.

So here's the example again.

The first version of this took advantage of yield's ability to send and get data at the same time to always send back an update (subtotal) to the parent routine. That's nearly free since a yield always sends something back anyway. (None if you don't give it something else.) But it's not always easy to do, or easy to understand if you do it. IE.. brain exploding stuff.

In this version, data only flows into the coroutine until a ValueRequest exception is thrown at it, at which point it then yields back a total.

*I can see where some routines may reverse the control, by throwing ValueReturns from the inside out, rather than ValueRequests from the outside in. Is it useful to distinguish between the two or should there be just one?

*Yes this can be made to work with gclose() and return, but I feel that is more restrictive, and more complex, than it needs to be.

*I still didn't figure out how to use the context managers to get rid of the try except. Oh well. ;-)

    from contextlib import contextmanager

    class ValueRequest(Exception):
        pass

    @contextmanager
    def consumer(cofunc, result=True):
        next(cofunc)
        try:
            yield cofunc
        finally:
            cofunc.close()

    @contextmanager
    def multiconsumer(cofuncs, result=True):
        for c in cofuncs:
            next(c)
        try:
            yield cofuncs
        finally:
            for c in cofuncs:
                c.close()

    # Min/max coroutine example split into
    # nested coroutines for testing these ideas
    # in a more complex situation that may arise
    # when working with cofunctions and generators.

    def reduce_item(f):
        try:
            x = yield
            while True:
                x = f(x, (yield))
        except ValueRequest:
            yield x

    def reduce_group(funcs):
        with multiconsumer([reduce_item(f) for f in funcs]) as mc:
            try:
                while True:
                    x = yield
                    for c in mc:
                        c.send(x)
            except ValueRequest:
                yield [c.throw(ValueRequest) for c in mc]

    def get_reductions(funcs, iterable):
        with consumer(reduce_group(funcs)) as c:
            for x in iterable:
                c.send(x)
            return c.throw(ValueRequest)

    def main():
        funcs = [min, max]
        print(get_reductions(funcs, range(100)))
        s = "Python is fun for play, and great for work too."
        print(get_reductions(funcs, s))

    if __name__ == '__main__':
        main()
On 10/27/2010 05:00 PM, Nick Coghlan wrote:
On Thu, Oct 28, 2010 at 2:18 AM, Ron Adam
wrote: It looks like No context managers return values in the finally or __exit__ part of a context manager. Is there way to do that?
The return value from __exit__ is used to decide whether or not to suppress the exception (i.e. bool(__exit__()) == True will suppress the exception that was passed in).
There are a few CMs in the test suite (test.support) that provide info about things that happened during their with statement - they all use the trick of returning a stateful object from __enter__, then modifying the attributes of that object in __exit__. I seem to recall the CM variants of unittest.TestCase.assertRaises* doing the same thing (so you can poke and prod at the raised exception yourself). warnings.catch_warnings also appends encountered warnings to a list returned by __enter__ when record=True.
Cheers, Nick.
Thanks, I'll take a look. If nothing else it will help me understand it better.

BTW, the use case of the (min/max) examples doesn't fit that particular need. It turned out that just creating a custom exception and throwing it into the coroutine is probably the best and simplest way to do it. That's not to say that some of the other things Guido is thinking of won't benefit from close() returning a value, but that particular example doesn't.

Cheers,
Ron
On Wed, Oct 27, 2010 at 5:00 PM, Ron Adam
[...]
Hm... Certainly interesting. My own (equally anti-climactic :-) conclusions would be:

- Tastes differ

- There is a point where yield gets overused

- I am not convinced that using reduce as a paradigm here is right

--
--Guido van Rossum (python.org/~guido)
On 10/27/2010 09:53 PM, Guido van Rossum wrote:
Hm... Certainly interesting. My own (equally anti-climactic :-) conclusions would be:
- Tastes differ
- There is a point where yield gets overused
- I am not convinced that using reduce as a paradigm here is right
I agree. :-) This was a contrived example for the purpose of testing an idea. The concept being tested had nothing to do with reduce. It had to do with the interface and control mechanisms. Cheers, Ron
On 2010-10-28 01:46, Guido van Rossum wrote:
Nick & Jacob,
Unfortunately other things are in need of my attention and I am quickly lagging behind on this thread.
Too bad, but understandable. I'll try to be brief(er).
I'll try to respond to some issues without specific quoting.
If GeneratorReturn and finish() can be implemented in pure user code, then I think it should be up to every (framework) developer to provide their own API, using whatever constraints they chose. Without specific use cases it's hard to reason about API design.
GeneratorReturn and finish *can* be implemented in pure user code, as long as you accept that the premature return has to use some other mechanism than "return" or StopIteration.
Still, I think it is reasonable to offer some basic behavior on the generator object, and I still think that the best compromise here is to let g.close() extract the return value from StopIteration if it catches it. If a framework decides not to use this, fine. For a user working without a framework this is still just a little nicer than having to figure out the required logic yourself.
This works only as long as you don't actually use yield-from, making it a bit of a strange match to that PEP. To get it to work *with* yield-from you need the reraised GeneratorExit to include the return value (possibly None) from the inner generator. I seem to have convinced Nick that the problem is real and that a modification to the expansion might be needed/desirable.
I am aware of four relevant states for generators. Here's how they work (in current Python):
- initial state: execution is poised at the top of the function. g.throw() always bounces back the exception. g.close() moves it to the final state. g.next() starts it running. g.send() requires a None argument and is then the same as g.next().
- running state: the frame is active. none of g.next(), g.send(), g.throw() or g.close() work -- they all raise ValueError.
- suspended state: execution is suspended at a yield. g.close() raises GeneratorExit and if the generator catches this it can do whatever it pleases. If it then raises StopIteration or GeneratorExit, g.close() is happy, if it raises another exception g.close() just passes that through, if it yields a value g.close() complains and raises RuntimeError().
- finished (exhausted) state: the generator has returned. g.close() always return None. g.throw() always bounces back the exception. g.next() and g.send() always raise StopIteration.
I would be in favor of adding an introspection API to distinguish these four states and I think it would be a fine thing to add to Python 3.2 if anyone finds the time to produce a patch (Nick? You showed what these boil down to.)
I note that in the initial state a generator has no choice in how to respond because it hasnt't yet had the opportunity to set up a try/except, so in this state it acts pretty much the same as in the exhausted state when receiving a throw() or close().
Yes, I forgot about this case in the versions of "finish" that I wrote. Nick showed a better version that handled it properly.
Regarding built-in syntax for Go-like channels, let's first see an implementation in userland become successful *or* see that it's impossible to write an efficient one before adding more to the language.
It is impossible in current python to use a for-loop or generator expression to loop over a Go-like channel without using threads for everything. (The only way to suspend the iteration is to suspend the thread, and then whatever code is supposed to write to the channel must be running in another thread) This is a shame, since the blocking nature of channels otherwise make them ideal for cooperative multitasking. Note, this restriction (no for-loop iteration without threads) does not make channels useless in current python, just much less convenient to work with. That, unfortunately, makes it less likely that a userland implementation will ever become successful.
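For what it's worth, a channel that *can* be iterated is easy enough once a thread is dedicated to the producer -- a rough sketch (class and names invented here) that shows where the thread-level blocking comes in:

    import queue
    import threading

    class Channel:
        _DONE = object()                  # sentinel marking end of stream

        def __init__(self, maxsize=1):
            self._q = queue.Queue(maxsize)

        def send(self, value):
            self._q.put(value)            # blocks the *thread* when full

        def close(self):
            self._q.put(self._DONE)

        def __iter__(self):
            return self

        def __next__(self):
            value = self._q.get()         # blocks the *thread* until data arrives
            if value is self._DONE:
                raise StopIteration
            return value

    def producer(ch):
        for i in range(3):
            ch.send(i)
        ch.close()

    ch = Channel()
    threading.Thread(target=producer, args=(ch,)).start()
    print(list(ch))   # [0, 1, 2] -- but the consuming loop blocks its whole
                      # thread, which is exactly what cooperative multitasking
                      # wants to avoid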
Note that having a different expansion of a for-loop based on the run-time value or type of the iterable cannot be done -- the expansion can only vary based on the syntactic form.
The intent was to have a different expansion depending on the type of function containing the for-loop (as in regular/cofunction). I think I made a few errors though, so the new expansion doesn't actually work with regular iterables. If I get around to fixing it I'll post the fix in that thread.
There are a few different conventions for using generators and yield-from; e.g. generators used as proper iterators with easy refactoring; generators used as tasks where yield X is used for blocking I/O operations; and generators used as "inverse generators" as in the parallel_reduce() example that initiated this thread. I don't particularly care about what kind of errors you get if a generator written for one convention is accidentally used by another convention, as long as it is made clear which convention is being used in each case. Frameworks/libraries can and probably should develop decorators to mark up the 2nd and 3rd conventions, but I don't think the *language* needs to go out of its way to enforce proper usage.
Agreed, I think. - Jacob
On 2010-10-28 00:52, Nick Coghlan wrote:
On Thu, Oct 28, 2010 at 6:22 AM, Jacob Holm
wrote: Actually, AFAICT outer_broken will *not* give a RuntimeError on close() after next(). This is due to the special-casing of GeneratorExit in PEP 380. That special-casing is also the basis for both my suggested modifications.
Ah, you're quite right - I'd completely forgotten about the GeneratorExit special-casing in the PEP 380 semantics, so I was arguing from a faulty premise. With that error corrected, I can happily withdraw my objection to idioms that convert GeneratorExit to StopIteration (since any yield from expressions will reraise the GeneratorExit in that case).
Looks like we are still not on exactly the same page though... You seem to be arguing from the version at http://www.python.org/dev/peps/pep-0380, whereas I am looking at http://mail.python.org/pipermail/python-ideas/attachments/20090419/c7d72ba8/..., which is newer.
The "did-it-really-finish?" question can likely be answered by slightly improving generator state introspection from the Python level (as I believe Guido suggested earlier in the thread). That way close() can keep the gist of its current semantics (return something if the generator ends up in an inactive state, raise RuntimeError if it yields another value), while frameworks can object to other unexpected states if they want to.
As it turns out, the information on generator state is already there, just not in a particularly user friendly format ("not started" = "g.gi_frame is not None and g.gi_frame.f_lasti == -1", "terminated" = "g.gi_frame is None").
So, without any modifications at all to the current incarnation of PEP 380, it is already possible to write:
    def finish(gen):
        frame = gen.gi_frame
        if frame is None:
            raise RuntimeError('finish() on exhausted/closed generator')
        if frame.f_lasti == -1:
            raise RuntimeError('finish() on not yet started generator')
        try:
            gen.throw(GeneratorExit)
        except StopIteration as err:
            if err.args:
                return err.args[0]
            return None
        except GeneratorExit:
            pass
        else:
            raise RuntimeError('Generator ignored GeneratorExit')
        raise RuntimeError('Generator failed to return a value')
Yes. I forgot about the "not yet started" case in my earlier versions.
I think I'm finally starting to understand *your* question/concern though. Given the current PEP 380 expansion, the above definition of finish() and the following two generators:
    def g_inner():
        yield
        return "Hello world!"

    def g_outer():
        yield (yield from g_inner())
You would get the following result (as g_inner converts GeneratorExit to StopIteration, then yield from propogates that up the stack):
    >>> g = g_outer()
    >>> next(g)
    >>> finish(g)
    "Hello world!"
Oops?
Well. Not with the newest expansion. Not that the None you will get from that one is any better.
I'm wondering if this part of the PEP 380 expansion:

    if _e is _x[1] or isinstance(_x[1], GeneratorExit):
        raise

Should actually look like:

    if _e is _x[1]:
        raise
    if isinstance(_x[1], GeneratorExit):
        raise GeneratorExit(*_e.args)

In the newer expansion, I would change:

    except GeneratorExit as _e:
        try:
            _m = getattr(_i, 'close')
        except AttributeError:
            pass
        else:
            _m()
        raise _e

Into:

    except GeneratorExit as _e:
        try:
            _m = getattr(_i, 'close')
        except AttributeError:
            pass
        else:
            raise GeneratorExit(_m())
        raise _e

(Which can be cleaned up a bit btw., by removing _e and using direct attribute access instead of getattr)
Once that distinction is made, you can more easily write helper functions and context managers that allow code to do the "right thing" according to the needs of a particular framework or application.
Yes. OTOH, I have argued for this change before with no luck. - Jacob
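One hypothetical example of the kind of helper/context manager Nick mentions above (an editorial sketch, not from the thread; it builds on the finish() function shown earlier and needs a Python where a generator may return a value):

    from contextlib import contextmanager

    @contextmanager
    def finishing(gen):
        # Ensure the (already started) generator is finished when the block
        # ends, and capture whatever finish() extracted from it.
        box = {}
        try:
            yield box
        finally:
            box['result'] = finish(gen)

    # Illustrative consumer: sums sent values, returns the total when closed.
    def summer():
        total = 0
        try:
            while True:
                total += yield
        except GeneratorExit:
            return total

    s = summer()
    next(s)                      # finish() requires a started generator
    with finishing(s) as box:
        for val in [3, 1, 4]:
            s.send(val)
    print(box['result'])         # -> 8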
On 2010-10-27 18:53, Jacob Holm wrote:
Hmm. This got me thinking. One thing I'd really like to see in python is something like the "channel" object from the go language (http://golang.org/).
Based on PEP 380 or Gregs new cofunctions PEP (or perhaps even without any of them) it is possible to write a trampoline-based implementation of a channel object with "send" and "next" methods that work as expected. One thing that is *not* possible (I think) is to make that object iterable. Your wild idea above gave me a similar wild idea of my own. An extension to the cofunctions PEP that would make that possible.
Seems like I screwed up the semantics of the standard for-loop in that version. Let me try again...

1) Add a new exception StopCoIteration, inheriting from StandardError. Change the regular StopIteration to inherit from the new exception instead of directly from StandardError. This ensures that code that catches StopCoIteration also catches StopIteration, which I think is what we want. The new exception is needed because "cocall func()" can never raise the regular StopIteration (or any subclass thereof). This might actually be an argument for using a different exception for returning a value from a coroutine...

2) Allow __next__ on an object to be a cofunction. Add a __cocall__ to the built-in next(ob) that tries to use cocall to call ob.__next__.

    def next__cocall__(ob, *args):
        if len(args) > 1:
            raise TypeError
        try:
            _next = type(ob).__next__
        except AttributeError:
            raise TypeError
        try:
            return cocall _next(ob)
        except StopCoIteration:
            if args:
                return args[0]
            raise

2a) Optionally allow __iter__ on an object to be a cofunction. Add a __cocall__ to the builtin iter.

    class _func_iter(object):
        def __init__(self, callable, sentinel):
            self.callable = callable
            self.sentinel = sentinel
        def __next__(self):
            v = cocall self.callable()
            if v is self.sentinel:
                raise StopCoIteration
            return v

    def iter__cocall__(*args):
        try:
            ob, = args
        except ValueError:
            try:
                callable, sentinel = args
            except ValueError:
                raise TypeError
            return _func_iter(callable, sentinel)
        try:
            _iter = type(ob).__iter__
        except AttributeError:
            raise TypeError
        return cocall _iter(ob)

3) Change the for-loop in a cofunction:

    for val in iterable:
        <block>
    else:
        <block>

so it expands into:

    _it = cocall iter(iterable)
    while True:
        try:
            val = cocall next(_it)
        except StopCoIteration:
            break
        <block>
    else:
        <block>

which is exactly the normal expansion, but using cocall to call iter and next, and catching StopCoIteration instead of StopIteration. Since cocall falls back to using a regular call, this should work well with all normal iterables.

3a) Alternatively, define a new syntax for "coiterating", e.g.:

    cocall for val in iterable:
        <block>
    else:
        <block>

All this to make it possible to write code like this:

    def consumer(ch):
        cocall for val in ch:
            print(val)

    def producer(ch):
        cocall for val in range(10):
            cocall ch.send(val)

    def main():
        sched = scheduler()
        ch = channel()
        sched.add(consumer(ch))
        sched.add(producer(ch))
        sched.run()

Thoughts?

- Jacob
On Thu, Oct 28, 2010 at 6:52 PM, Jacob Holm
On 2010-10-28 00:52, Nick Coghlan wrote:
On Thu, Oct 28, 2010 at 6:22 AM, Jacob Holm
wrote: Actually, AFAICT outer_broken will *not* give a RuntimeError on close() after next(). This is due to the special-casing of GeneratorExit in PEP 380. That special-casing is also the basis for both my suggested modifications.
Ah, you're quite right - I'd completely forgotten about the GeneratorExit special-casing in the PEP 380 semantics, so I was arguing from a faulty premise. With that error corrected, I can happily withdraw my objection to idioms that convert GeneratorExit to StopIteration (since any yield from expressions will reraise the GeneratorExit in that case).
Looks like we are still not on exactly the same page though... You seem to be arguing from the version at http://www.python.org/dev/peps/pep-0380, whereas I am looking at http://mail.python.org/pipermail/python-ideas/attachments/20090419/c7d72ba8/..., which is newer.
Ah, the comment earlier in the thread about the PEP not being up to date with the last discussion makes more sense now... Still, the revised expansion also does the right thing in the case that was originally bothering me, and I agree with your suggested tweak to that version. I've cc'ed Greg directly on this email - if he wants, I can check in an updated version of the PEP to bring the python.org version up to speed with the later discussions. With that small change to the yield from expansion, as well as the change to close to return the first argument to StopIteration (if any) and None otherwise, I think PEP 380 will be in a much better position to support user experimentation in this area. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Thu, Oct 28, 2010 at 9:46 AM, Guido van Rossum
Nick & Jacob,
Unfortunately other things are in need of my attention and I am quickly lagging behind on this thread.
I'll try to respond to some issues without specific quoting.
If GeneratorReturn and finish() can be implemented in pure user code, then I think it should be up to every (framework) developer to provide their own API, using whatever constraints they chose. Without specific use cases it's hard to reason about API design. Still, I think it is reasonable to offer some basic behavior on the generator object, and I still think that the best compromise here is to let g.close() extract the return value from StopIteration if it catches it. If a framework decides not to use this, fine. For a user working without a framework this is still just a little nicer than having to figure out the required logic yourself.
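As a rough pure-Python illustration of the behaviour being described here (an editorial sketch, not part of the proposal text), the proposed close() would act approximately like:

    # Approximate user-level equivalent of the proposed close() behaviour:
    # if the generator responds to GeneratorExit by returning a value,
    # hand that value back to the caller.
    def close_with_value(gen):
        try:
            gen.throw(GeneratorExit)
        except StopIteration as err:
            return err.args[0] if err.args else None
        except GeneratorExit:
            return None                  # generator did not catch it
        raise RuntimeError('generator ignored GeneratorExit')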
Yep, we've basically agreed on that as the way forward as well. We have a small tweak to suggest for PEP 380 to avoid losing the return value from inner close() calls, and I've cc'ed Greg directly on the relevant message in order to move that idea forward (and bring the python.org version of the PEP up to date with the last posted version as well). That should provide a solid foundation for experimentation in user code in 3.3 without overcomplicating PEP 380 with stuff that will probably end up being YAGNI.
I would be in favor of adding an introspection API to distinguish these four states and I think it would be a fine thing to add to Python 3.2 if anyone finds the time to produce a patch (Nick? You showed what these boil down to.)
I've created a tracker issue proposing a simple inspect.getgeneratorstate() function (issue 10220). Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
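For a sense of how such a function would be used, here is a short editorial example (the GEN_* names follow the inspect.getgeneratorstate() API as it was eventually added; treat the exact spelling as an assumption in the context of this thread):

    import inspect

    def gen():
        yield 1

    g = gen()
    print(inspect.getgeneratorstate(g))   # GEN_CREATED
    next(g)
    print(inspect.getgeneratorstate(g))   # GEN_SUSPENDED
    g.close()
    print(inspect.getgeneratorstate(g))   # GEN_CLOSED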
On Thu, Oct 28, 2010 at 5:44 AM, Nick Coghlan
Yep, we've basically agreed on that as the way forward as well. We have a small tweak to suggest for PEP 380 to avoid losing the return value from inner close() calls,
This is my "gclose()" function, right? Or is there more to it?
and I've cc'ed Greg directly on the relevant message in order to move that idea forward (and bring the python.org version of the PEP up to date with the last posted version as well).
Greg's been remarkably quiet on this thread even though I cc'ed him early on. Have you heard back from him yet?
That should provide a solid foundation for experimentation in user code in 3.3 without overcomplicating PEP 380 with stuff that will probably end up being YAGNI.
I would be in favor of adding an introspection API to distinguish these four states and I think it would be a fine thing to add to Python 3.2 if anyone finds the time to produce a patch (Nick? You showed what these boil down to.)
I've created a tracker issue proposing a simple inspect.getgeneratorstate() function (issue 10220).
I added a little something to the issue. -- --Guido van Rossum (python.org/~guido)
Jacob Holm wrote:
1) Define a new "coiterator" protocol, consisting of a new special method __conext__, and a new StopCoIteration exception that the regular StopIteration inherits from.
I don't think it's necessary to have a new protocol. All that's needed is to allow for the possibility of the __next__ method of an iterator being a cofunction. Under the current version of PEP 3152, with an explicit "cocall" operation, this would require a new kind of for-loop. Maybe using "cofor"? However, my current thinking on cofunctions is that cocalls should be implicit -- you declare a cofunction with "codef", and any call made within it can potentially be a cocall. In that case, there would be no need for new syntax -- the existing for-loop could just do the right thing when given an object whose __next__ method is a cofunction. Thinking about this has made me even more sure that implicit cocalls are the way to go, because it means that any other things we think of that need to take cofunctions into account can be fixed without having to introduce new syntax for each one. -- Greg
Jacob Holm wrote:
The new exception is needed because "cocall func()" can never raise the regular StopIteration (or any subclass thereof).
Botheration, I hadn't thought of that! I'll have to think about this one. I still feel that it shouldn't be necessary to define any new protocol -- one ought to be able to simply write a __next__ cofunction that looks like a normal one in all respects except that it's defined with 'codef'. Maybe a StopIteration raised inside a cofunction shouldn't be synonymous with a return, but instead should be caught and tunnelled around the yield-from via another exception. -- Greg
Nick Coghlan wrote:
On Thu, Oct 28, 2010 at 6:52 PM, Jacob Holm
wrote:
Looks like we are still not on exactly the same page though... You seem to be arguing from the version at http://www.python.org/dev/peps/pep-0380, whereas I am looking at http://mail.python.org/pipermail/python-ideas/attachments/20090419/c7d72ba8/..., which is newer.
Still, the revised expansion also does the right thing in the case that was originally bothering me,
That attachment is slightly older than my own current draft,
which is attached below. The differences in the expansion are
as follows (- is the version linked to above, + is my current
version):
@@ -141,20 +141,21 @@
_s = yield _y
except GeneratorExit as _e:
try:
- _m = getattr(_i, 'close')
+ _m = _i.close
except AttributeError:
pass
else:
_m()
raise _e
except BaseException as _e:
+ _x = sys.exc_info()
try:
- _m = getattr(_i, 'throw')
+ _m = _i.throw
except AttributeError:
raise _e
else:
try:
- _y = _m(*sys.exc_info())
+ _y = _m(*_x)
except StopIteration as _e:
_r = _e.value
break
Does this version still address your concerns? If so, please
check it in as the latest version.
--
Greg
PEP: XXX
Title: Syntax for Delegating to a Subgenerator
Version: $Revision$
Last-Modified: $Date$
Author: Gregory Ewing
On 2010-10-28 22:14, Greg Ewing wrote:
Jacob Holm wrote:
1) Define a new "coiterator" protocol, consisting of a new special method __conext__, and a new StopCoIteration exception that the regular StopIteration inherits from.
I don't think it's necessary to have a new protocol. All that's needed is to allow for the possibility of the __next__ method of an iterator being a cofunction.
That is more or less exactly what I did for my second version. It turns out to be less simple than that, because you need to make "next" work as a cofunction as well, and there is a problem with raising StopIteration from a cofunction.
Under the current version of PEP 3152, with an explicit "cocall" operation, this would require a new kind of for-loop. Maybe using "cofor"?
However, my current thinking on cofunctions is that cocalls should be implicit -- you declare a cofunction with "codef", and any call made within it can potentially be a cocall. In that case, there would be no need for new syntax -- the existing for-loop could just do the right thing when given an object whose __next__ method is a cofunction.
Thinking about this has made me even more sure that implicit cocalls are the way to go, because it means that any other things we think of that need to take cofunctions into account can be fixed without having to introduce new syntax for each one.
Yes. Looking at a few examples using my toy implementation of Go channels made me realise just how awkward it would be to have to mark all cocall sites explicitly. With implicit cocalls and a for-loop changed to work with a cofunction __next__, working with channels can be made to look exactly like working with generators. For me, that would be a major selling point for the PEP. - Jacob
On Fri, Oct 29, 2010 at 1:04 AM, Guido van Rossum
On Thu, Oct 28, 2010 at 5:44 AM, Nick Coghlan
wrote: Yep, we've basically agreed on that as the way forward as well. We have a small tweak to suggest for PEP 380 to avoid losing the return value from inner close() calls,
This is my "gclose()" function, right? Or is there more to it?
Yeah, the idea is your gclose(), plus one extra tweak to the expansion of "yield from" to store the result of the inner close() call on a new GeneratorExit instance. To use a toy example:

    # Even this toy framework needs a little structure
    class EndSum(Exception): pass

    def gsum():
        # Sums sent values until EndSum or GeneratorExit are thrown in
        tally = 0
        try:
            while 1:
                tally += yield
        except (EndSum, GeneratorExit):
            pass
        return x

    def average_sums():
        # Advances to a new sum when EndSum is thrown in
        # Finishes the last sum and averages them all when GeneratorExit is thrown in
        sums = []
        try:
            while 1:
                sums.append(yield from gsum())
        except GeneratorExit as ex:
            # Our proposed expansion tweak is to enable the next line
            sums.append(ex.args[0])
        return sum(sums) / len(sums)

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Thu, Oct 28, 2010 at 6:17 PM, Nick Coghlan
On Fri, Oct 29, 2010 at 1:04 AM, Guido van Rossum
wrote: On Thu, Oct 28, 2010 at 5:44 AM, Nick Coghlan
wrote: Yep, we've basically agreed on that as the way forward as well. We have a small tweak to suggest for PEP 380 to avoid losing the return value from inner close() calls,
This is my "gclose()" function, right? Or is there more to it?
Yeah, the idea is your gclose(), plus one extra tweak to the expansion of "yield from" to store the result of the inner close() call on a new GeneratorExit instance.
To use a toy example:
    # Even this toy framework needs a little structure
    class EndSum(Exception): pass
    def gsum():
        # Sums sent values until EndSum or GeneratorExit are thrown in
        tally = 0
        try:
            while 1:
                tally += yield
        except (EndSum, GeneratorExit):
            pass
        return x
You meant return tally. Right?
    def average_sums():
        # Advances to a new sum when EndSum is thrown in
        # Finishes the last sum and averages them all when GeneratorExit is thrown in
        sums = []
        try:
            while 1:
                sums.append(yield from gsum())
        except GeneratorExit as ex:
            # Our proposed expansion tweak is to enable the next line
            sums.append(ex.args[0])
        return sum(sums) / len(sums)
Hmmm... That looks pretty complicated. Wouldn't it be much more straightforward if instead of

    value ... value EndSum value ... value EndSum value ... value GeneratorExit

the input sequence was required to be

    value ... value EndSum value ... value EndSum value ... value *EndSum* GeneratorExit

? Then gsum() wouldn't have to catch EndSum at all, and I don't think the PEP would have to special-case GeneratorExit. average_sums() could simply have

    except GeneratorExit:
        return sum(sums) / len(sums)

After all this is a fairly arbitrary protocol and the caller presumably can do whatever is required of it. If there are values between the last EndSum and the last GeneratorExit those will be ignored -- that is a case of garbage in, garbage out. If you really wanted to catch that mistake there would be several ways to translate it reliably into some other exception -- or log it, or whatever. It is also defensible that a better design of the protocol would not require throwing EndSum but sending some agreed-upon marker value.

--
--Guido van Rossum (python.org/~guido)
On Thu, Oct 28, 2010 at 4:21 PM, Guido van Rossum
On Thu, Oct 28, 2010 at 6:17 PM, Nick Coghlan
wrote: To use a toy example:
    # Even this toy framework needs a little structure
    class EndSum(Exception): pass
    def gsum():
        # Sums sent values until EndSum or GeneratorExit are thrown in
        tally = 0
        try:
            while 1:
                tally += yield
        except (EndSum, GeneratorExit):
            pass
        return x
You meant return tally. Right?
    def average_sums():
        # Advances to a new sum when EndSum is thrown in
        # Finishes the last sum and averages them all when GeneratorExit is thrown in
        sums = []
        try:
            while 1:
                sums.append(yield from gsum())
        except GeneratorExit as ex:
            # Our proposed expansion tweak is to enable the next line
            sums.append(ex.args[0])
        return sum(sums) / len(sums)
This toy example is a little confusing to me because it has typos… which is natural when one is writing a program without being able to run it to debug it. So, I wrote a version of the accumulator/averager that will work in Python 2.7 (and I think 3, but I didn't test it):

    from contextlib import contextmanager  # needed for @contextmanager below

    class ReturnValue(Exception): pass

    def prime_pump(gen):
        def f(*args, **kwargs):
            g = gen(*args, **kwargs)
            next(g)
            return g
        return f

    @prime_pump
    def accumulator():
        total = 0
        length = 0
        try:
            while 1:
                value = yield
                total += value
                length += 1
                print(length, value, total)
        except GeneratorExit:
            r = ReturnValue()
            r.total = total
            r.length = length
            raise r

    @contextmanager
    def get_sum(it):
        try:
            it.close()
        except ReturnValue as r:
            yield r.total

    @contextmanager
    def get_average(it):
        try:
            it.close()
        except ReturnValue as r:
            yield r.total / r.length

    def main():
        running_total = accumulator()
        sums = accumulator()
        running_total.send(6)  # For example, whatever
        running_total.send(7)
        with get_sum(running_total) as first_sum:
            sums.send(first_sum)
        running_total = accumulator()  # Zero it out
        running_total.send(2)  # For example, whatever
        running_total.send(2)
        running_total.send(5)
        running_total.send(8)
        with get_sum(running_total) as second_sum:
            sums.send(second_sum)
        # Get the average of the sums
        with get_average(sums) as r:
            return r

    main()

So, I guess the question I have is how will the proposed extensions to the language make the above code prettier? One thing I can see is that if it's possible to return from inside a generator, it can be more straightforward to get the values out of the accumulator at the end:

    try:
        while 1:
            value = yield
            total += value
            length += 1
            print(length, value, total)
    except GeneratorExit:
        return total, length

With Guido's proposed "for item from yield" syntax, IIUC this can be prettied up even more as:

    for value from yield:
        total += value
        length += 1
    return total, length

Are there other benefits to the proposed extensions? How will the call sites be improved? I'm not sure how I would rewrite main() to be prettier/more clear in light of the proposals…

Thanks,

-- Carl Johnson
On 10/28/2010 08:17 PM, Nick Coghlan wrote:
On Fri, Oct 29, 2010 at 1:04 AM, Guido van Rossum
wrote: On Thu, Oct 28, 2010 at 5:44 AM, Nick Coghlan
wrote: Yep, we've basically agreed on that as the way forward as well. We have a small tweak to suggest for PEP 380 to avoid losing the return value from inner close() calls,
This is my "gclose()" function, right? Or is there more to it?
Yeah, the idea is your gclose(), plus one extra tweak to the expansion of "yield from" to store the result of the inner close() call on a new GeneratorExit instance.
To use a toy example:
    # Even this toy framework needs a little structure
    class EndSum(Exception): pass
    def gsum():
        # Sums sent values until EndSum or GeneratorExit are thrown in
        tally = 0
        try:
            while 1:
                tally += yield
        except (EndSum, GeneratorExit):
            pass
        return x
    def average_sums():
        # Advances to a new sum when EndSum is thrown in
        # Finishes the last sum and averages them all when GeneratorExit is thrown in
        sums = []
        try:
            while 1:
                sums.append(yield from gsum())
        except GeneratorExit as ex:
            # Our proposed expansion tweak is to enable the next line
            sums.append(ex.args[0])
        return sum(sums) / len(sums)
Nick, could you add a main() or calling routine? I'm having trouble seeing the complete logic without that. Cheers, Ron
On 29/10/10 15:21, Guido van Rossum wrote:
value ... value EndSum value ... value EndSum value ... value *EndSum* GeneratorExit
Seems to me that anything requiring asking for intermediate values while not stopping the computation entirely is going beyond what can reasonably be supported with a generator. I wouldn't like to see yield-from and/or the generator protocol made any more complicated in order to allow such things. -- Greg
On 10/28/2010 08:17 PM, Nick Coghlan wrote:
    def average_sums():
        # Advances to a new sum when EndSum is thrown in
        # Finishes the last sum and averages them all when GeneratorExit is thrown in
        sums = []
        try:
            while 1:
                sums.append(yield from gsum())
Wouldn't this need to be...

    gsum_ = gsum()
    next(gsum_)
    sums.append(yield from gsum_)

Or does the yield from allow send on a just started generator?
        except GeneratorExit as ex:
            # Our proposed expansion tweak is to enable the next line
            sums.append(ex.args[0])
        return sum(sums) / len(sums)
Ron
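On the priming question just above: under the PEP 380 expansion the yield from expression itself advances the subgenerator to its first yield, so no separate next() call is needed before sending. A tiny check (an editorial sketch; it needs a Python with PEP 380, so it is hypothetical relative to this thread):

    def inner():
        x = yield          # primed by the yield from in outer()
        y = yield
        return x + y       # return-with-value needs PEP 380

    def outer():
        total = yield from inner()
        yield total

    g = outer()
    next(g)            # advances outer *and* inner to inner's first yield
    g.send(1)          # delivered straight to inner
    print(g.send(2))   # inner returns 3, outer yields it -> prints 3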
On Mon, Oct 25, 2010 at 6:35 PM, Jacob Holm
I had some later suggestions for how to change the expansion, see e.g. http://mail.python.org/pipermail/python-ideas/2009-April/004195.html (I find that version easier to reason about even now 1½ years later)
I like that style too. Here it is with some annotations.

    _i = iter(EXPR)
    _m, _a = next, (_i,)
    # _m is a function or a bound method;
    # _a is a tuple of arguments to call _m with;
    # both are set to other values further down
    while 1:
        # Move the generator along
        try:
            _y = _m(*_a)
        except StopIteration as _e:
            _r = _e.value
            break
        # Yield _y and process what came back in
        try:
            _s = yield _y
        except GeneratorExit as _e:
            # Request to exit
            try:
                # NOTE: This _m is unrelated to the other
                _m = _i.close
            except AttributeError:
                pass
            else:
                _m()
            raise _e  # Always exit
        except BaseException as _e:
            # An exception was thrown in; pass it along
            _a = sys.exc_info()
            try:
                _m = _i.throw
            except AttributeError:
                # Can't throw it in; throw it back out
                raise _e
        else:
            # A value was sent in; pass it along
            if _s is None:
                _m, _a = next, (_i,)
            else:
                _m, _a = _i.send, (_s,)
    RESULT = _r

I do note that this is a bit subtle; I don't like the reusing of _m and it's hard to verify that _m and _a are set on every path that goes back to the top of the loop.

--
--Guido van Rossum (python.org/~guido)
On 10/27/2010 11:53 AM, Jacob Holm wrote:
On 2010-10-26 18:56, Guido van Rossum wrote:
Now, if I may temporarily go into wild-and-crazy mode (this *is* python-ideas after all :-), we could invent some ad-hoc syntax for this pattern, e.g.:
Hmm. This got me thinking. One thing I'd really like to see in python is something like the "channel" object from the go language (http://golang.org/).
Based on PEP 380 or Gregs new cofunctions PEP (or perhaps even without any of them) it is possible to write a trampoline-based implementation of a channel object with "send" and "next" methods that work as expected. One thing that is *not* possible (I think) is to make that object iterable. Your wild idea above gave me a similar wild idea of my own. An extension to the cofunctions PEP that would make that possible.
[clipped]
All this to make it possible to write a code like this:
    def consumer(ch):
        for val in ch:
            cocall print(val)  # XXX need a cocall somewhere
    def producer(ch):
        for val in range(10):
            cocall ch.send(val)
    def main():
        sched = scheduler()
        ch = channel()
        sched.add(consumer(ch))
        sched.add(producer(ch))
        sched.run()
Thoughts?
The following isn't quite the same as channels, but you might be able to use the technique below to get something like it. These examples show how linking generators using nonlocal can simplify some problems. The nice thing about these is the inner loops are very simple and there are no try-except blocks, exceptions, or if statements involved. ;-)

Ron

    def averager():
        def collector():
            nonlocal tally, count
            while 1:
                tally += yield
                count += 1
        def emitter():
            nonlocal tally, count
            while 1:
                yield tally / count
        count = tally = 0
        coll = collector()
        next(coll)
        return coll, emitter()

    coll, emit = averager()
    for x in range(100):
        coll.send(x)
    print('average: %s' % next(emit))

    def parallel_reduce(iterable, funcs):
        def reduce_collector(func):
            def collector(func):
                nonlocal outcome
                outcome = yield
                while 1:
                    outcome = func(outcome, (yield))
            def emitter():
                nonlocal outcome
                while 1:
                    yield outcome
            outcome = None
            coll = collector(func)
            next(coll)
            return coll, emitter()
        collectors = [reduce_collector(func) for func in funcs]
        for val in iterable:
            for coll, _ in collectors:
                coll.send(val)
        return [next(emit) for _, emit in collectors]

    print(parallel_reduce(range(5, 10), [min, max]))
participants (7)
- Carl M. Johnson
- Greg Ewing
- Guido van Rossum
- Jacob Holm
- Nick Coghlan
- Peter Otten
- Ron Adam