[Tutor] lazy? vs not lazy? and yielding

Dave Angel davea at ieee.org
Wed Mar 3 21:41:34 CET 2010


John wrote:
> Hi,
>
> I just read a few pages of a tutorial on list comprehensions and generator 
> expressions.  From what I gather the difference is "[ ]" and "( )" at the 
> ends, better memory usage and something the tutorial labeled as "lazy 
> evaluation".  So a generator 'yields'.  But what is it yielding to?  
>
> John
>
>   
A list comprehension builds the whole list at once.  So if the list 
needed is large enough, it can take a very long time to finish, and you 
may well run out of memory and crash first.  A generator expression builds 
a generator object instead, which *acts* like a list when you loop over 
it, but doesn't actually compute the values till you ask for them.  So you 
can still do things like
    for item in fakelist:

and it does what you'd expect.
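
To make the difference concrete, here is a small sketch.  The names 
squares_list and squares_gen are made up for illustration, and the range 
is kept tiny so it runs instantly; the point is only when the values get 
computed.

```python
# A list comprehension computes every value immediately.
squares_list = [i * i for i in range(3, 16)]

# A generator expression computes nothing yet; it only remembers how.
squares_gen = (i * i for i in range(3, 16))

print(len(squares_list))      # the list already holds all 13 values
# len(squares_gen) would raise TypeError: a generator has no length

total = 0
for item in squares_gen:      # values are produced one at a time, on demand
    total += item

print(total == sum(squares_list))
```

Both forms feed the for loop the same values; only the list ever has all 
of them in memory at the same moment.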


You can write a generator yourself, and better understand what it's 
about.  Suppose you were trying to build a "list" of the squares of the 
integers between 3 and 15.  For a list of that size, you could just use 
a list comprehension.  But pretend it was much larger, and you couldn't 
spare the memory or the time.

So let's write a generator function by hand, deliberately the hard way.

def mygen():
    i = 3
    while i < 16:
        yield i*i
        i += 1
    return

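This function is a generator function, by virtue of that yield statement 
in it.  When it's called, the body doesn't run right away.  Instead you 
get back a generator object, which knows how to run the body a piece at a 
time, and that makes it easy to use in a loop.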
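
You can watch that happen by stepping the generator by hand with the 
built-in next().  This is only a sketch to show the mechanism (the names 
gen, first and second are made up); normally you'd let a for loop do it.

```python
def mygen():
    i = 3
    while i < 16:
        yield i * i
        i += 1

gen = mygen()          # calling it runs none of the body yet
first = next(gen)      # runs the body up to the first yield
second = next(gen)     # resumes right after the yield, loops, yields again

print(first)
print(second)
```

Each next() call picks up exactly where the previous one stopped, with 
the local variable i still intact.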

If you now use
    for item in mygen():
        print(item)

then each time through the loop, it resumes the mygen() function where 
it left off and runs it up to the next yield statement.  And the value 
that's put into item comes from the yield statement.

When the mygen() function returns (or falls off the end), it actually 
raises a special exception, StopIteration, that quietly terminates the 
for/loop.
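
The exception in question is StopIteration.  As a rough sketch of what 
the for statement does on your behalf (this is an approximation written 
out by hand, not the interpreter's actual code, and the names it and 
results are made up):

```python
def mygen():
    i = 3
    while i < 16:
        yield i * i
        i += 1

results = []
it = iter(mygen())            # a for loop starts by asking for an iterator
while True:
    try:
        item = next(it)       # ask for the next value
    except StopIteration:     # raised when mygen() returns or falls off the end
        break                 # the for loop swallows it and simply stops
    results.append(item)      # this is the "body" of the loop

print(results)
```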

Now, when we're doing simple expressions for a small number of values, 
we should use a list comprehension.  When it gets big enough, switch to 
a generator expression.  And if it gets complicated enough, switch to a 
generator function.  The point here is that the user of the for/loop 
doesn't care which way it was done.
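
Here's a small sketch of that last point.  The function add_up() is a 
made-up consumer that only uses a for loop, so it can't tell which of the 
three ways produced its input.

```python
def add_up(values):
    """Sum whatever the for loop hands us; 'values' can be any iterable."""
    total = 0
    for item in values:
        total += item
    return total

def squares():
    """Generator function version of the same squares."""
    for i in range(3, 16):
        yield i * i

from_list = add_up([i * i for i in range(3, 16)])     # list comprehension
from_genexp = add_up(i * i for i in range(3, 16))     # generator expression
from_genfunc = add_up(squares())                      # generator function

print(from_list == from_genexp == from_genfunc)
```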

Sometimes you really need a list.  For example, you can't generally back 
up in a generator, or randomly access the [i] item.  But a generator is 
a very valuable mechanism to understand.
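
A quick sketch of those limitations, with made-up variable names.  If you 
find you need indexing or a second pass, wrapping the generator in list() 
gets you a real list (at the cost of the memory you were saving).

```python
gen = (i * i for i in range(3, 16))

try:
    gen[2]                      # generators don't support indexing
    indexable = True
except TypeError:
    indexable = False

first_pass = list(gen)          # this consumes the generator
second_pass = list(gen)         # nothing left: a generator can't back up

squares = first_pass            # a real list can be indexed and reused
print(indexable)
print(len(second_pass))
print(squares[2])
```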

For a complex example, consider searching a hard disk for a particular 
file.  Building a complete list might take a long time, and use a lot of 
memory.  But if you use a generator inside a for loop, you can terminate 
(break) when you meet some condition, and the remainder of the files 
never have to be visited.  See os.walk().
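
As a minimal sketch of that idea: os.walk() is itself a generator, so 
leaving the loop early means the rest of the tree is never scanned.  The 
helper find_first() is a hypothetical name of my own, not something in 
the standard library.

```python
import os

def find_first(top, wanted):
    """Return the full path of the first file named 'wanted' found under
    the directory 'top', or None if there isn't one.  (Hypothetical helper.)"""
    # os.walk() yields one (dirpath, dirnames, filenames) tuple per
    # directory, and only looks at the next directory when asked.
    for dirpath, dirnames, filenames in os.walk(top):
        if wanted in filenames:
            # Returning here abandons the walk, so the remaining
            # directories are never visited.
            return os.path.join(dirpath, wanted)
    return None
```

You'd call it with something like find_first(".", "notes.txt"), where 
both arguments are just example values.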

DaveA


