(BTW, I'm trying to update the PEP with a discussion of thunks.) [Guido]
The main advantage of thunks that I can see is that you can save the thunk for later, like a callback for a button widget (the thunk then becomes a closure).
[Greg]
Or pass it on to another function. This is something we haven't considered -- what if one resource-acquisition generator (RAG?) wants to delegate to another RAG?
With normal generators, one can always use the pattern
    for x in sub_generator(some_args):
        yield x
But that clearly isn't going to work if the generators involved are RAGs, because the exceptions passed in are going to be raised at the point of the yield in the outer RAG, and the inner RAG isn't going to get finalized (assuming the for-loop doesn't participate in the finalization protocol).
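A rough illustration of the problem with the for-loop delegation pattern above. It uses generator.throw() from later Python versions as a stand-in for the exception passing that PEP 340 proposes; the names inner, outer and log are made up for this sketch.

```python
log = []

def inner():
    # Stand-in for the inner RAG: it would like to see the exception.
    try:
        yield 1
        yield 2
    except ValueError:
        log.append("inner saw ValueError")
        raise

def outer():
    # Stand-in for the outer RAG, delegating with a plain for-loop.
    try:
        for x in inner():
            yield x
    except ValueError:
        log.append("outer saw ValueError")

g = outer()
first = next(g)          # both generators are now suspended at a yield
try:
    g.throw(ValueError("boom"))
except StopIteration:
    pass                 # outer handled the exception and then finished
```

The exception surfaces at the yield in outer(); inner() never gets a chance to handle it, which is the gap being pointed out.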
To get the finalization right, the inner generator needs to be invoked as a RAG, too:
    block sub_generator(some_args):
        yield
But PEP 340 doesn't say what happens when the block contains a yield.
The same as when a for-loop contains a yield. The sub_generator is entirely unaware of this yield, since the local control flow doesn't actually leave the block (i.e., it's not like a break, continue or return statement). When the loop that was resumed by the yield calls next(), the block is resumed back after the yield. The generator finalization semantics guarantee (within the limitations of all finalization semantics) that the block will be resumed eventually. I'll add this to the PEP, too.

I'd say that a yield in a thunk would be more troublesome: does it turn the thunk into a generator or the containing function? It would have to be the thunk, but then things get weird quickly (the caller of the thunk has to treat the result of the call as an iterator).
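The for-loop case referred to above can be sketched with ordinary generators. This only shows today's for-loop behaviour, not the proposed block statement; sub, outer and events are names invented for the example.

```python
events = []

def sub():
    for i in range(2):
        events.append(f"sub yields {i}")
        yield i

def outer():
    for x in sub():
        events.append(f"outer yields {x}")
        yield x   # control leaves outer(), but sub() does not notice
        events.append(f"outer resumed after {x}")

result = list(outer())
```

Each time the consumer asks for the next value, outer() picks up right after its own yield, and sub() is only resumed once the loop body has finished.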
A thunk implementation wouldn't have any problem with this, since the thunk can be passed down any number of levels before being called, and any exceptions raised in it will be propagated back up through all of them.
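A small sketch of that point, using plain callables in place of thunks (an assumption, since no thunk syntax exists). The functions inner_resource and outer_resource are hypothetical resource managers; the second simply hands the thunk on to the first.

```python
def inner_resource(log, thunk):
    log.append("inner acquire")
    try:
        thunk()
    finally:
        log.append("inner release")

def outer_resource(log, thunk):
    log.append("outer acquire")
    try:
        inner_resource(log, thunk)   # delegate: pass the thunk down a level
    finally:
        log.append("outer release")

def body():
    raise ValueError("boom")

log = []
try:
    outer_resource(log, body)
except ValueError:
    log.append("caller saw ValueError")
```

The exception raised in the body unwinds through both levels, so each one gets to run its cleanup before the caller sees it.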
The other problem with thunks is that once we think of them as the anonymous functions they are, we're pretty much forced to say that a return statement in a thunk returns from the thunk rather than from the containing function.
Urg, you're right. Unless return is turned into an exception in that case. And then I suppose break and continue (and yield?) will have to follow suit.
But wasn't that exactly what you were trying to avoid? :-)
I'm just trying to think how Smalltalk handles this, since it must have a similar problem, but I can't remember the details.
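The return problem discussed above can be seen with ordinary nested functions standing in for thunks. The sketch below also shows one possible shape of the "turn return into an exception" idea; _ReturnFromCaller and each are hypothetical names, and nothing here is what the PEP specifies.

```python
class _ReturnFromCaller(Exception):
    """Hypothetical signal carrying a return value out through the thunk's caller."""
    def __init__(self, value):
        super().__init__(value)
        self.value = value

def each(items, thunk):
    for item in items:
        thunk(item)

def first_negative_plain(items):
    def thunk(item):
        if item < 0:
            return item   # only leaves the thunk; each() keeps looping
    each(items, thunk)
    return None

def first_negative_with_exception(items):
    def thunk(item):
        if item < 0:
            raise _ReturnFromCaller(item)   # emulate returning from the container
    try:
        each(items, thunk)
    except _ReturnFromCaller as signal:
        return signal.value
    return None
```

With the plain version the return value is simply dropped by each(); the exception version gets the value out, at the cost of the machinery the thread is worried about.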
every local variable used or set in the thunk would have to become a 'cell'. Cells slow down access somewhat compared to regular local variables.
True, but is the difference all that great? It's just one more C-level indirection, isn't it?
Alas not. It becomes a call to PyCell_Set() or PyCell_Get().
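For readers who haven't looked at cells: they are already how nested functions share variables with their containing function, and they can be inspected from Python. A minimal sketch, with outer and read as made-up names:

```python
def outer():
    total = 0            # 'total' is a cell variable, since read() uses it

    def read():
        return total     # read through the shared cell

    total = 10           # the outer function stores into the same cell
    return read

read = outer()
cell = read.__closure__[0]   # the cell object shared by both functions
```

Here outer.__code__.co_cellvars and read.__code__.co_freevars both name 'total', and cell.cell_contents holds the value that outer() stored last.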
we'll want variables *assigned to* by the thunk also to be shared with the containing function,
Agreed. We'd need to add a STORE_CELL bytecode or something for this.
This actually exists -- it is used for when an outer function stores into a local that it shares with an inner function.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)