yield

Isaac To kkto at csis.hku.hk
Sun Jan 18 05:07:42 CET 2004


>>>>> "km" == km  <km at mrna.tn.nic.in> writes:

    km> Hi all, i didnt understand the purpose of 'yield' keyword and the
    km> concept of 'generators' in python.  can someone explain me with a
    km> small example how generators differ from normal function calls?

Normally, when you define a function and call it, the code within the
function gets executed until the function returns, and at that point the
function's execution context disappears altogether.  For example:

>>> def f():
...   a = 1
...   while a < 32:
...     print 'Hello, ' + str(a)
...     return a
...     a *= 2
... 
>>> f()
Hello, 1
1

Here we define f.  At the time of definition the statements of the function
are not executed; they are executed when the function is called.  Once the
function returns, its execution context disappears completely: the
statements after the return statement (a *= 2 here) are never executed, and
the remaining iterations of the loop never run.  Here, after printing
'Hello, 1', the function stops, returning the value 1 to the caller (which
is printed by the Python interpreter).

When a function contains the "yield" keyword, the story is somewhat
different.  If we simply replace the "return" keyword with the "yield"
keyword in the above example we get this:

>>> def f():
...   a = 1
...   while a < 32:
...     print 'Hello, ' + str(a)
...     yield a
...     a *= 2
... 
>>> f()
<generator object at 0x4022136c>

In other words, the statements within the function are not executed even
when we call f()!  Instead, calling f() returns an object, which is called a
"generator" object.  The basic method of this object is "next".  For
example:

>>> g = f()
>>> g.next()
Hello, 1
1

So, the effect of f().next() in the function using "yield" is exactly the
same as f() in the function using "return".  The difference in the current
version is that we still have an object g, so... we can call next() again!

>>> g.next()
Hello, 2
2
>>> g.next()
Hello, 4
4
>>> g.next()
Hello, 8
8
>>> g.next()
Hello, 16
16
>>> g.next()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
StopIteration

In other words, after the function f yields, the function and its execution
context have not disappeared.  Instead they are still stored within the
generator object, waiting for you to call next() again.  At that point the
function continues its execution.  If during the execution the function
yields another value, g.next() returns that value and f is suspended again.
If during the execution the function returns, g.next() raises a
StopIteration exception (and will do so again if you call g.next() again).
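
As a minimal sketch of the behaviour just described, here is the same
generator written for current Python 3, where g.next() is spelled next(g).
That spelling is an assumption about your interpreter; the sessions in this
post are from Python 2.3.  The list named log is only an illustrative helper
that records when the body actually runs, in place of the print statement.

```python
log = []  # hypothetical helper: records when the body of f really executes

def f():
    a = 1
    while a < 32:
        log.append(a)   # stands in for the print statement in the post
        yield a
        a *= 2

g = f()
assert log == []        # calling f() alone runs none of the body

assert next(g) == 1     # runs the body up to the first yield
assert log == [1]

# Each further next() resumes right after the yield, with 'a' intact.
rest = [next(g) for _ in range(4)]
assert rest == [2, 4, 8, 16]

# Once the function returns, every later next() raises StopIteration.
stops = 0
for _ in range(2):
    try:
        next(g)
    except StopIteration:
        stops += 1
assert stops == 2
```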

This is a powerful construct: it is the only Python construct that lets you
easily "remember" the dynamic execution structure.  (Most other languages
lack that facility, so there it is impossible to remember the dynamic
execution structure, short of packaging all the information you want in a
data structure yourself.)  In the above example, after the first execution
of next(), the g object remembers (in its "frame object", which can be
reached as g.gi_frame) what local variables are defined and what values they
hold, which line the function is currently executing, and what global
variables are currently visible:

>>> g = f()
>>> g.next()
Hello, 1
1
>>> dir(g.gi_frame)
['__class__', '__delattr__', '__doc__', '__getattribute__', '__hash__', '__init__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__str__', 'f_back', 'f_builtins', 'f_code', 'f_exc_traceback', 'f_exc_type', 'f_exc_value', 'f_globals', 'f_lasti', 'f_lineno', 'f_locals', 'f_restricted', 'f_trace']
>>> g.gi_frame.f_locals
{'a': 1}
>>> g.gi_frame.f_lineno
5
>>> g.gi_frame.f_globals
{'__builtins__': <module '__builtin__' (built-in)>, '__name__': '__main__', '__doc__': None, 'g': <generator object at 0x4022146c>, 'f': <function f at 0x40219f44>}
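
To see what the generator saves you, here is a rough sketch of the
"package all the information in a structure" alternative: a hand-written
iterator class that behaves like f(), minus the printing.  The class name
PowersOfTwo is made up for illustration, and the code uses Python 3 spelling
(a __next__ method instead of the next method shown in this post).

```python
class PowersOfTwo:
    """Hand-rolled equivalent of the generator f(), without the print."""

    def __init__(self):
        self.a = 1            # the "local variable" must live on the object
        self.started = False  # the "current line" must be tracked by hand

    def __iter__(self):
        return self

    def __next__(self):
        if self.started:
            self.a *= 2       # the code that followed the yield
        self.started = True
        if self.a >= 32:      # the while condition, negated
            raise StopIteration
        return self.a


def f():
    a = 1
    while a < 32:
        yield a
        a *= 2


values_from_class = list(PowersOfTwo())
values_from_generator = list(f())
assert values_from_class == values_from_generator
```

Everything the frame object remembered automatically (the value of a, and
where execution stopped) has to be stored and restored by hand in the class
version.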

The simplest way to use the construct, and also by far the most common use
of it, is as an "iterator", i.e., to repeatedly call next() on it in a loop
until it finishes by raising a StopIteration exception.  E.g.,

>>> for i in f():
...   print i * 3 + 7
... 
Hello, 1
10
Hello, 2
13
Hello, 4
19
Hello, 8
31
Hello, 16
55

Here the "for" loop calls the next() method of the generator returned by
f() repeatedly, each time putting the yielded value into the variable i and
executing the "body" of the for loop (which prints the value i*3+7).  Similar
things happen in list comprehensions:

>>> [i * 3 + 7 for i in f()]
Hello, 1
Hello, 2
Hello, 4
Hello, 8
Hello, 16
[10, 13, 19, 31, 55]
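
Roughly speaking, the for loop above behaves like the explicit loop in the
following sketch.  It is written in Python 3 spelling (next(it) rather than
it.next()), leaves out the print inside f, and collects the values in a list
named results, which is only there for illustration.

```python
def f():
    a = 1
    while a < 32:
        yield a
        a *= 2

results = []
it = iter(f())          # for a generator, iter() returns the generator itself
while True:
    try:
        i = next(it)    # what "for i in f()" does on each pass
    except StopIteration:
        break           # the for loop catches this and simply ends
    results.append(i * 3 + 7)
```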

Note that, apart from printing the 'Hello, n' messages, both of the above
have the same end effect as if f() were the list [1, 2, 4, 8, 16].  So you
can treat a generator as a "lazy" list in which (1) each element is
generated "on-the-fly" when next() is called, and thus may be affected by
changes in the environment such as global variables, and may in turn affect
the environment, e.g. by printing messages or changing global variables; and
(2) after producing an element, the generator does not keep the element
itself.
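
Both points can be illustrated with a small sketch.  The names scaled and
scale are invented for this example, and the code uses Python 3 spelling
(next(g) rather than g.next()).

```python
scale = 1  # a global the generator reads each time it resumes

def scaled():
    for n in (1, 2, 3):
        yield n * scale

g = scaled()
first = next(g)      # computed with scale == 1
scale = 10           # change the environment between elements
second = next(g)     # computed on demand, now with scale == 10

remaining = list(g)  # drains whatever is left
again = list(g)      # nothing was stored, so a second pass is empty
```

If you need the elements more than once, build a real list from the
generator with list() and iterate over that instead.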

Regards,
Isaac.


