[Python-ideas] free variables in generator expressions

Jan Kanis jan.kanis at phil.uu.nl
Sat Dec 15 23:41:12 CET 2007


On Thu, 13 Dec 2007 08:08:53 +0100, Arnaud Delobelle  
<arno at marooned.org.uk> wrote:

>
> On 12 Dec 2007, at 23:41, Georg Brandl wrote:
>
>> Arnaud Delobelle schrieb:
>>
>>> Let's test this (python 2.5):
>>>
>>> >>> A = '12'
>>> >>> B = 'ab'
>>> >>> gen = (x + y for x in A for y in B)
>>> >>> A = '34'
>>> >>> B = 'cd'
>>> >>> list(gen)
>>>  ['1c', '1d', '2c', '2d']
>>>
>>> So in the generator expression, A remains bound to the string '12'
>>> but B gets rebound to 'cd'.  This may make the implementation of
>>> generator expressions more straightforward, but from the point of view
>>> of a user of the language it seems rather arbitrary. What makes A so
>>> special as opposed to B?  Ok it belongs to the outermost loop, but
>>> conceptually in the example above there is no outermost loop.
>>
>> Well, B might depend on A so it can't be evaluated in the outer context
>> at the time the genexp "function" is called. It has to be evaluated
>> inside the "function".
>
> You're right. I expressed myself badly: I was not talking about
> evaluation but binding.  I was saying that if the name A is bound to
> the object that A is bound to when the generator expression is
> created, then the same should happen with B.
>

I think what Georg meant was this (I intended to send this as a reply to
your earlier mail of Thursday AM, but Georg beat me to it):

The reason for not binding B when the genexp is defined is so you can do  
this:

  >>> A = [[1, 2], [3, 4]]
  >>> gen = (x for b in A for x in b)
  >>> list(gen)
  [1, 2, 3, 4]

Here, b can't be bound to something at generator definition time because  
the 'something' may not exist yet. (It does actually in this example, but  
you get the point.) So, only the first (outer loop) iterable is bound  
immediately.
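
To make that concrete, here is a rough sketch of Arnaud's example next to a
hand-written generator function. It assumes the genexp behaves like a hidden
function that receives only the already-evaluated outermost iterable as its
argument; the name _genexp is made up for illustration, and the code the
compiler really generates differs in detail. Written for Python 3, but the
result is the same as in the 2.5 session quoted above.

```python
A = '12'
B = 'ab'

# The real generator expression: iter(A) is taken right here,
# while B is only looked up once the generator starts running.
gen = (x + y for x in A for y in B)

# Hypothetical hand-written equivalent (illustrative name, not a real API).
def _genexp(outermost):
    for x in outermost:
        for y in B:  # B is looked up each time this inner loop starts
            yield x + y

equiv = _genexp(iter(A))  # the outermost iterable is evaluated immediately

# Rebind both names before consuming either generator.
A = '34'
B = 'cd'

from_genexp = list(gen)
from_function = list(equiv)
print(from_genexp)    # ['1c', '1d', '2c', '2d']
print(from_function)  # ['1c', '1d', '2c', '2d']
```

Both generators keep iterating over the old '12' but pick up the new 'cd'.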

Whether a variable is rebound within the expression could of course be  
decided at compile time, so all free variables could be bound immediately.  
I think that would be an improvement, but it requires the compiler to be a  
bit smarter. Unfortunately, it seems to be pythonic to bind variables at  
moments I disagree with :), like function default arguments (bound at  
definition instead of call) and loop counters (rebound every iteration  
instead of every iteration having its own scope).
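
For anyone who hasn't run into those two binding moments, a small
illustration (the names late, early and the results lists are just made up
for this example). The loop variable is rebound on every iteration, so the
closures all see its final value; a default argument is bound once, when the
lambda is defined, which is the usual way to freeze the current value.

```python
late = []
early = []
for i in range(3):
    # Closure over i: the name is looked up when the lambda is called,
    # by which time the loop has rebound i to its last value.
    late.append(lambda: i)
    # Default argument: the current value of i is bound at definition time.
    early.append(lambda i=i: i)

late_results = [func() for func in late]
early_results = [func() for func in early]
print(late_results)   # [2, 2, 2]
print(early_results)  # [0, 1, 2]
```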

And, while I'm writing this:

On Thu, 13 Dec 2007 00:01:42 +0100, Arnaud Delobelle  
<arno at marooned.org.uk> wrote:
> l = [f(x, y) for x in A for y in B(x) if g(x, y)]
> g = [f(x, y) for x in A for y in B(x) if g(x, y)]
> <code, maybe binding A, B, f, g to new objects>
> assert list(g) == l

I suppose this should have been

g = (f(x, y) for x in A for y in B(x) if g(x, y))
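
With that correction, a sketch of what the assertion does today and how
binding every free variable immediately can be emulated by hand. The concrete
A, B, f and g below are invented purely for illustration, the generator is
called gen_late / gen_early instead of g so that it doesn't clash with the
filter function g, and bind_now is a hypothetical helper, not an existing or
proposed API.

```python
# Invented sample values for the names used in Arnaud's example.
A = [1, 2, 3]
B = range                      # B(x) yields 0 .. x-1

def f(x, y):
    return x * y

def g(x, y):
    return (x + y) % 2 == 1    # keep pairs with an odd sum

l = [f(x, y) for x in A for y in B(x) if g(x, y)]

# Current semantics: only iter(A) is fixed now; B, f and g are looked up later.
gen_late = (f(x, y) for x in A for y in B(x) if g(x, y))

# Emulating "bind all free variables immediately": pass them as arguments,
# so the genexp closes over the function's locals instead of the outer names.
def bind_now(A, B, f, g):
    return (f(x, y) for x in A for y in B(x) if g(x, y))

gen_early = bind_now(A, B, f, g)

# <code, maybe binding A, B, f, g to new objects>
f = lambda x, y: x + y
A = [10]

late_ok = list(gen_late) == l    # False: gen_late uses the rebound f
early_ok = list(gen_early) == l  # True: gen_early kept the original objects
print(late_ok, early_ok)
```

Rebinding A makes no difference to either generator; it is the rebinding of
f that breaks the assertion for the plain genexp.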


Jan


