Nested generator caveat

Mark Wooding mdw at distorted.org.uk
Sun Jul 6 18:03:02 CEST 2008


Dieter Maurer <dieter at handshake.de> wrote:

> I met the following surprising behaviour

[code moved to later in this reply...]

> The apparent reason is that the free variables in nested generator
> definitions are not bound (to a value) at invocation time but only at
> access time.

No.  This is about the difference between binding and assignment.
Unfortunately, Python doesn't have explicit syntax for doing the former.

Here's what's actually going on in your generator.

> >>> def gen0():
> ...   for i in range(3):
> ...     def gen1():
> ...       yield i
> ...     yield i, gen1()

The function gen0 contains a yield statement; it's therefore a
generator.  It contains an assignment to a variable i (in this case,
it's implicit in the `for' loop).  So, on entry to the code, a fresh
location is allocated, and the variable i is bound to it.

The function gen1 contains a yield statement too, so it's also a
generator.  It contains a free reference to a variable i, so it shares
the binding in the outer scope.
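
One way to see the shared binding directly is sketched below.  It assumes
CPython 2.6 or later (or Python 3), where a function exposes its
closed-over variables as cell objects through __closure__; the name
make_closures is made up for the illustration, and it uses lambdas rather
than generators only to keep it short.

```python
def make_closures():
    fs = []
    for i in range(3):
        # Each lambda has a free reference to i, so it shares the
        # binding of i in make_closures rather than copying a value.
        fs.append(lambda: i)
    return fs

fs = make_closures()

# All three closures hold the very same cell object for i.
shared = fs[0].__closure__[0] is fs[1].__closure__[0] is fs[2].__closure__[0]

# By the time we call them, the loop has finished, so they all see
# whatever was last stored in that one location.
values = [f() for f in fs]
```

Here shared comes out True and values is [2, 2, 2]: there is one
location, and three functions looking at it.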

Here's the important part: the for loop works by assigning to the
location named by i each time through.  It doesn't rebind i to a fresh
location.  So each time you kick gen1, it produces the current value of
i at that time.  So...

> >>> for i,g in gen0(): print i, g.next()
> 0 0
> 1 1
> 2 2

Here, the for loop in gen0 is suspended each iteration while we do some
printing.  So the variable i (in gen0) still matches the value yielded
by gen0.

But...

> >>> for i,g in list(gen0()): print i, g.next()
> 0 2
> 1 2
> 2 2

Here, gen0 has finished all of its iterations before we start kicking
any of the returned generators.  So the value of i in gen0 is 2 (the
last element of range(3)).
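
The two transcripts above can be reproduced side by side with the sketch
below.  It is written with next(g) instead of g.next() so that it should
run unchanged on Python 2.6 and on Python 3; the names lazy, eager and
mixed are mine.  The third case steps gen0 by hand to show that an inner
generator reports whatever i happens to be when it is kicked, not what it
was when the generator was created.

```python
def gen0():
    for i in range(3):
        def gen1():
            yield i
        yield i, gen1()

# gen0 is suspended at its yield while each inner generator runs,
# so the inner generator sees the same i that was just yielded.
lazy = [(i, next(g)) for i, g in gen0()]

# list() runs gen0 to completion first; every inner generator then
# sees the final value stored in gen0's i.
eager = [(i, next(g)) for i, g in list(gen0())]

# Stepping by hand: advance gen0 twice, then kick both inner generators.
it = gen0()
i0, g0 = next(it)
i1, g1 = next(it)            # gen0's i is now 1
mixed = (next(g0), next(g1))
```

This gives lazy == [(0, 0), (1, 1), (2, 2)], eager == [(0, 2), (1, 2),
(2, 2)] and mixed == (1, 1).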

> Almost surely, the same applies to all locally defined functions
> with free variables.
> This would mean that locally defined functions with free
> variables are very risky in generators.

It means that you must be careful about the difference between binding
and assignment when dealing with closures of whatever kind.

Here's an example without involving nested generators.

def gen():
  for i in xrange(3):
    yield lambda: i
for f in gen(): print f()          # prints 0, 1, 2 (gen is suspended each time)
for f in list(gen()): print f()    # prints 2, 2, 2 (gen has already finished)

To fix the problem, you need to arrange for something to actually rebind
a variable around your inner generator on each iteration through.  Since
Python doesn't have any mechanism for controlling variable binding other
than defining functions, that's what you'll have to do.

def genfix():
  for i in xrange(3):
    def bind(i):
      def geninner():
        yield i
      return geninner()
    yield i, bind(i)

shows the general pattern, but since a generator has the syntactic form
of a function anyway, we can fold the two together.

def genfix2():
  for i in xrange(3):
    def geninner(i):
      yield i
    yield i, geninner(i)

Yes, this is cumbersome.  A `let' statement would help a lot.  Or
macros. ;-)
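
As a check on the two fixes above, the sketch below runs both through
list() first, which is the case that went wrong originally.  It uses range
and next(g) so that it should run on Python 2.6 and Python 3 alike.  The
last function, gen_default, is not from the discussion above: it is my
addition, showing the common default-argument idiom, which gets a fresh
binding for the lambda case because the default is evaluated when the
lambda is defined.

```python
def genfix():
    for i in range(3):
        def bind(i):
            # This i is bind's own parameter: a fresh binding per call.
            def geninner():
                yield i
            return geninner()
        yield i, bind(i)

def genfix2():
    for i in range(3):
        def geninner(i):
            # The argument is bound when geninner(i) is called, even
            # though the body doesn't start running until later.
            yield i
        yield i, geninner(i)

def gen_default():
    # Assumption: not part of the original fix, shown for comparison.
    for i in range(3):
        yield lambda i=i: i

fixed = [(i, next(g)) for i, g in list(genfix())]
fixed2 = [(i, next(g)) for i, g in list(genfix2())]
defaults = [f() for f in list(gen_default())]
```

Both fixed and fixed2 come out as [(0, 0), (1, 1), (2, 2)], and defaults
is [0, 1, 2].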

-- [mdw]


