[Python-Dev] Possible resolution of generator expression variable capture dilemma

Phillip J. Eby pje at telecommunity.com
Wed Mar 24 10:37:37 EST 2004


At 05:37 PM 3/24/04 +1200, Greg Ewing wrote:
>An advantage of this approach is that *all* forms of nested scope
>(lambda, def, etc.) would benefit, not just generator expressions. I
>suspect it would eradicate most of the few remaining uses for the
>default-argument hack, for instance (which nested scopes were supposed
>to do, but didn't).

Wow.  I haven't spent a lot of time thinking this through for possible 
holes, but my initial reaction is that it'd be a big plus.  It has always 
been counterintuitive to me that I can't define a function or lambda 
expression in a for loop without first having to create a function that 
returns a function.
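
To make the complaint concrete, here is a small sketch in present-day 
Python 3 syntax (an assumption; the thread predates it).  The function 
names are illustrative only and do not come from the thread.  It shows the 
shared binding, the function-returning-a-function workaround, and the 
default-argument hack side by side:

```python
def make_late():
    funcs = []
    for i in range(3):
        funcs.append(lambda: i)        # every lambda refers to the same i
    return [f() for f in funcs]        # all calls happen after the loop ends


def make_with_factory():
    def factory(i):
        return lambda: i               # i is a fresh local of each factory call
    funcs = []
    for i in range(3):
        funcs.append(factory(i))
    return [f() for f in funcs]


def make_with_default():
    funcs = []
    for i in range(3):
        funcs.append(lambda i=i: i)    # default value is evaluated at definition
    return [f() for f in funcs]


print(make_late())          # [2, 2, 2]
print(make_with_factory())  # [0, 1, 2]
print(make_with_default())  # [0, 1, 2]
```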

But then...  what about while loops?  I think it'd be confusing if I 
switched between a for and a while loop (in either direction) and the 
semantics of nested function definitions changed.  Indeed, I think you'd 
need to make this work for *all* variables rebound inside *all* loops that 
contain a nested scope, not just a for loop's index variable.

Would this produce any incompatibilities?  Today, if you define a 
function inside a loop, intending to call it from outside the loop, your 
code doesn't work in any sane way.  If you define one that's to be called 
from inside the loop, it will work the same way as before...  unless you're 
rebinding variables after the definition, but before the call point.

So, it does seem that there could be some code that would change its 
semantics.  Such code would have to define a function inside a loop, and 
reference a variable that is rebound inside the loop, but after the 
function is defined.  E.g.:

for x in 1,2,3:
     def y():
         print z
     z = x * 2
     y()

Today, this code prints 2, 4, and 6, but under the proposed approach it 
would presumably get an unbound local error.
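
A rough Python 3 sketch of that difference follows.  It assumes one 
reading of the proposal, namely that a variable's value is captured at the 
point where the nested function is defined, and it emulates that capture 
with a default argument.  The function names are hypothetical, and output 
is collected in a list rather than printed so the results can be compared:

```python
def today():
    out = []
    for x in (1, 2, 3):
        def y():
            out.append(z)      # z is looked up when y runs, not when defined
        z = x * 2
        y()
    return out


def capture_at_definition():
    # Assumption: a default argument stands in for definition-time capture.
    out = []
    for x in (1, 2, 3):
        def y(z=z):            # z is evaluated here, before its first assignment
            out.append(z)
        z = x * 2
        y()
    return out


print(today())                 # [2, 4, 6]
try:
    capture_at_definition()
except UnboundLocalError as exc:
    print("error:", exc)       # raised on the first iteration
```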

So, I think the trick here would be figuring out how to specify this in 
such a way that it makes sense for its intended use without fouling up 
code that works today.  Reallocating cells at the top of the loop might 
work:

for x in 1,2,3:
     def y(): print z
     z = x * 2
     def q(): print z
     z = x * 3
     y()
     q()

This code prints 3,3,6,6,9,9 today, and would do the same under the 
proposed approach.  What *doesn't* work is invoking a previous definition 
after modifying a local:

for x in 1,2,3:
     z = x * 3
     if x>1:
         y()
     def y():
         print z
     z = x * 2

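Today, this prints 6,9, but with cells reallocated at the top of each 
iteration it would print 2,4: each call reaches the y defined in the 
previous iteration, whose z was last set to x * 2 there.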
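
As a hedged Python 3 sketch of the example above: the first function keeps 
today's single shared binding of z, and the second emulates "reallocating 
cells at the top of the loop" by running each iteration's body in its own 
function call.  That emulation is an assumption about how the proposal 
would behave, and the names body and prev_y are hypothetical.  In the 
emulation, each deferred call reports the last value z held in the 
iteration that defined y:

```python
def shared_cell():
    out = []
    for x in (1, 2, 3):
        z = x * 3
        if x > 1:
            y()                # y from the previous iteration, current z
        def y():
            out.append(z)
        z = x * 2
    return out


def per_iteration_cells():
    out = []

    def body(x, prev_y):
        z = x * 3              # this z belongs to this call of body only
        if x > 1:
            prev_y()           # closes over the previous call's z
        def y():
            out.append(z)
        z = x * 2
        return y

    prev_y = None
    for x in (1, 2, 3):
        prev_y = body(x, prev_y)
    return out


print(shared_cell())           # [6, 9]
print(per_iteration_cells())   # [2, 4]
```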

Admittedly, I am hard-pressed to imagine an actual use case for this 
pattern of code execution, but if anybody's done it, their code would break.

Unfortunately, this means that even your comparatively modest proposal 
(only 'for' loops, and only the index variable) can have this same issue, 
if the loop index variable is being rebound.  This latter pattern (define 
in one iteration, invoke in a later one) will change its meaning under such 
a capture scheme.
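
A minimal Python 3 sketch of that last case, where only the for loop's 
index variable is involved.  As an assumption, the per-iteration capture is 
emulated by passing x into a helper function; body and prev_y are 
illustrative names, not part of any proposal:

```python
def shared_index():
    out = []
    y = None
    for x in (1, 2, 3):
        if y is not None:
            out.append(y())    # y defined in the previous iteration
        def y():
            return x
        x = x * 10             # rebind the index after y is defined
    return out


def per_iteration_index():
    out = []

    def body(x, prev_y):
        if prev_y is not None:
            out.append(prev_y())
        def y():
            return x           # this x belongs to this call of body only
        x = x * 10
        return y

    prev_y = None
    for x in (1, 2, 3):
        prev_y = body(x, prev_y)
    return out


print(shared_index())          # [2, 3]
print(per_iteration_index())   # [10, 20]
```

With one shared binding, the for statement overwrites the rebound value 
before the deferred call runs; with a binding per iteration, the deferred 
call still sees the rebound value from the iteration that defined it.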



