Odd closure issue for generators

Thu Jun 4 19:02:35 EDT 2009

On Thu, 04 Jun 2009 18:40:07 -0300, Brian Quinlan <brian at sweetapp.com>
wrote:

> This is from Python built from the py3k branch:

It's not new; the same thing happens with 2.x.

A closure captures (part of) the enclosing namespace, so names are
resolved in that environment even after the enclosing block has finished
execution.
As always, the name->value lookup happens when it is required at
runtime, not earlier ("late binding"). So, in the generator expression
(lambda : i for i in range(11, 16)) the 'i' is looked up in the enclosing
namespace when the lambda is called, not when the lambda is created.
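
A minimal sketch of late binding outside of any generator may help here.
The names `make_getter`, `get` and `x` are made up for illustration; they
do not come from the original code:

```python
def make_getter():
    x = 1

    def get():
        # `x` is looked up in the enclosing namespace when get() is
        # called, not when get is defined.
        return x

    x = 2  # rebinding `x` after `get` has already been created
    return get


getter = make_getter()
result = getter()
print(result)  # 2, the value `x` has at call time
```

The inner function never stored the value 1; it only remembers where to
look for `x`.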

Your first example:

> [1] >>> c = (lambda : i for i in range(11, 16))
> [2] >>> for q in c:
> [3] ... 	print(q())
>     ...
> 11
> 12
> 13
> 14
> 15
>  >>> # This is expected

This code means:

[1] create a generator (from the generator expression) and name it `c`
[2] ask `c` for an iterator. A generator is its own iterator, so it returns
itself.
Loop begins: ask the iterator for a new item (the first one, in fact). So the
generator executes one cycle, yields a function object (the lambda), and
freezes.
Note that the current value of `i` (in that frozen environment) is 11.
Name the yielded object (the lambda) `q`.
[3] Call the q object. That means, execute the lambda. It returns the
current value of i, 11. Print it.
Back to [2]: ask the iterator for a new item. The generator resumes, executes
another cycle, yields another lambda, and freezes. Now, i is 12
inside the frozen environment.
[3] execute the lambda -> 12
etc.
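
The steps above can be sketched with an explicit generator function. This
is only roughly what the generator expression does (the function name
`gen` and the list `lazy` are mine, just for illustration):

```python
def gen():
    # Roughly what (lambda : i for i in range(11, 16)) does.
    for i in range(11, 16):
        yield lambda: i  # the generator freezes here, with the current i


lazy = []
for q in gen():
    # The lambda is called while the generator is still frozen at the
    # step that produced it, so it sees that step's value of i.
    lazy.append(q())

print(lazy)  # [11, 12, 13, 14, 15]
```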

Your second example:

> [4] >>> c = (lambda : i for i in range(11, 16))
> [5] >>> d = list(c)
> [6] >>> for q in d:
> [7] ... 	print(q())
>     ...
> 15
> 15
> 15
> 15
> 15
>  >>> # I was very surprised

[4] creates a generator, same as above.
[5] ask for an iterator (c itself). Do the iteration NOW until exhaustion,
and collect each yielded object into a list. Those objects will be
lambdas. By the time list() returns, the value of i is 15 (its final
value), because the range() iteration has finished.
[6] iterate over the list...
[7] ...and execute each lambda. At this time, `i` is always 15.
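
If you want each lambda to keep the value `i` had when the lambda was
created, one common workaround (not the only one) is to bind it as a
default argument, since defaults are evaluated at function creation time.
A small sketch; the names `late` and `early` are just for illustration:

```python
# Late binding: every lambda looks up `i` when it is called, and by then
# the generator is exhausted and i is 15.
c = (lambda: i for i in range(11, 16))
late = [q() for q in list(c)]
print(late)   # [15, 15, 15, 15, 15]

# Workaround: `i=i` evaluates the current value of i when each lambda is
# created and stores it as that lambda's own default argument.
c = (lambda i=i: i for i in range(11, 16))
early = [q() for q in list(c)]
print(early)  # [11, 12, 13, 14, 15]
```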

> Looking at the implementation, I see why this happens:
>  >>> c = (lambda : i for i in range(11, 16))
>  >>> for q in c:
> ... 	print(id(q.__closure__[0]))
> ...
> 3847792
> 3847792
> 3847792
> 3847792
> 3847792
>  >>> # The same closure is used by every lambda

...because all of them refer to the same `i` name (q.__closure__[0] is the
cell object that holds `i`, and every lambda shares that one cell).
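
A quick way to check this, assuming CPython's `__closure__` and
`cell_contents` attributes (the names `funcs`, `cell_ids` and `contents`
are made up for this sketch):

```python
c = (lambda: i for i in range(11, 16))
funcs = list(c)  # exhaust the generator, keeping all five lambdas alive

# Each lambda has one closure cell, the one for `i`. Collect their ids.
cell_ids = {id(f.__closure__[0]) for f in funcs}
print(len(cell_ids))  # 1: a single cell shared by all five lambdas

# The shared cell holds the last value bound to `i`.
contents = funcs[0].__closure__[0].cell_contents
print(contents)  # 15
```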

> But it seems very odd to me and it can lead to some problems that are a  
> real pain in the ass to debug.

Yes, at least if one is not aware of the consequences. I think this (or a
simpler example) should be explained in the FAQ. The question comes up on
this list again and again...

-- 
Gabriel Genellina