[Tutor] lazy? vs not lazy? and yielding
Dave Angel
davea at ieee.org
Wed Mar 3 21:41:34 CET 2010
John wrote:
> Hi,
>
> I just read a few pages of tutorial on list comprehension and generator
> expression. From what I gather the difference is "[ ]" and "( )" at the
> ends, better memory usage and something the tutorial labeled as "lazy
> evaluation". So a generator 'yields'. But what is it yielding to?
>
> John
>
>
A list comprehension builds the whole list at once. If the list is
large enough, that can take a very long time, or exhaust memory before
it ever finishes. A generator expression instead builds a generator
object, which *acts* like a list in a loop but doesn't compute each
value till you ask for it. You can still do things like
    for item in fakelist:
and it does what you'd expect.
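To make the difference concrete, here is a small sketch (assuming Python 3; the names squares_list and squares_gen are just made up for illustration). It builds the same squares both ways, compares the size of the two objects, and loops over each in the same manner. The exact sizes depend on your Python build, so treat the comments as rough.

```python
import sys

# A list comprehension computes and stores every value immediately.
squares_list = [n * n for n in range(100000)]

# A generator expression only stores the recipe; nothing is computed yet.
squares_gen = (n * n for n in range(100000))

# The list object grows with the number of items; the generator does not.
print(sys.getsizeof(squares_list))  # hundreds of kilobytes
print(sys.getsizeof(squares_gen))   # small and constant

# Both can be consumed by exactly the same kind of loop.
total_from_list = 0
for item in squares_list:
    total_from_list += item

total_from_gen = 0
for item in squares_gen:
    total_from_gen += item

print(total_from_list == total_from_gen)
```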
You can write a generator yourself, and better understand what it's
about. Suppose you were trying to build a "list" of the squares of the
integers between 3 and 15. For a list of that size, you could just use
a list comprehension. But pretend it was much larger, and you couldn't
spare the memory or the time.
So let's write a generator function by hand, deliberately the hard way.
    def mygen():
        i = 3
        while i < 16:
            yield i*i
            i += 1
        return
This function is a generator function, by virtue of the yield statement
in it. Calling it doesn't run the body right away. Instead it returns a
generator object, which a for loop knows how to step through.
If you now use
    for item in mygen():
        print(item)
each time through the loop, Python resumes mygen() where it left off and
runs it up to the next yield statement. The value that's put into item
comes from that yield statement. So that's who the generator is yielding
to: whoever asks for the next value, here the for loop.
When the mygen() function returns (or falls off the end), it actually
raises a special exception, StopIteration, which the for loop catches
and treats as its signal to stop.
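If you want to see that machinery without the for loop hiding it, you can drive the generator by hand with the built-in next(). This is only a sketch (Python 3 assumed); the variable names are mine, and the generator is repeated here so the snippet runs on its own.

```python
def mygen():
    i = 3
    while i < 16:
        yield i * i
        i += 1

gen = mygen()        # nothing inside mygen() has run yet
first = next(gen)    # runs up to the first yield
second = next(gen)   # resumes right after that yield
rest = list(gen)     # drains whatever is left

# Asking once more raises the exception a for loop would quietly swallow.
try:
    next(gen)
    finished = False
except StopIteration:
    finished = True

print(first, second, len(rest), finished)
```

A for loop does essentially this for you: it calls next() repeatedly and stops when StopIteration shows up.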
Now, when we're doing simple expressions for a small number of values,
we should use a list comprehension. When it gets big enough, switch to
a generator expression. And if it gets complicated enough, switch to a
generator function. The point here is that the user of the for loop
doesn't care which way it was done.
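Here is a rough illustration of that last point (Python 3 assumed; total() and the three names are invented for the example). The same consumer loop is handed a list comprehension, a generator expression, and a generator function in turn, and can't tell them apart.

```python
# 1. List comprehension: every value is built up front.
squares_list = [i * i for i in range(3, 16)]

# 2. Generator expression: values are produced on demand.
squares_genexp = (i * i for i in range(3, 16))

# 3. Generator function: same idea, with room for more complicated logic.
def squares_genfunc():
    for i in range(3, 16):
        yield i * i

def total(items):
    """A consumer that only needs to loop over whatever it is given."""
    result = 0
    for item in items:
        result += item
    return result

totals = [total(squares_list), total(squares_genexp), total(squares_genfunc())]
print(totals)
```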
Sometimes you really need a list. For example, you can't generally back
up in a generator, or randomly access the [i] item. But a generator is
a very valuable mechanism to understand.
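A quick sketch of those limitations (Python 3 assumed, names made up): a generator can't be indexed, and once you've looped over it, it's used up. If you need either ability, build a list.

```python
gen = (i * i for i in range(3, 16))

# Random access isn't supported on a generator.
try:
    gen[2]
    indexable = True
except TypeError:
    indexable = False

# A generator can only be walked once; the second pass finds nothing.
first_pass = list(gen)
second_pass = list(gen)

# A real list supports both indexing and repeated passes.
as_list = [i * i for i in range(3, 16)]
third_item = as_list[2]

print(indexable, len(first_pass), second_pass, third_item)
```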
For a complex example, consider searching a hard disk for a particular
file. Building a complete list might take a long time, and use a lot of
memory. But if you use a generator inside a for loop, you can terminate
(break) when you meet some condition, and the remaining files
never have to be visited. See os.walk().
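One way this could look, as a sketch only (Python 3 assumed): all_files() and find_first() are hypothetical helpers of my own, built on os.walk(), which is itself a generator. The demo at the bottom runs against a throwaway temporary directory, so the file name "needle.txt" is invented too.

```python
import os
import tempfile

def all_files(root):
    """Yield the path of every file under root, one at a time."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            yield os.path.join(dirpath, name)

def find_first(root, target):
    """Return the first path whose file name equals target, or None."""
    found = None
    for path in all_files(root):
        if os.path.basename(path) == target:
            found = path
            break  # stop here; the rest of the tree is never walked
    return found

# Small demo on a temporary directory tree.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "a", "b"))
    target_path = os.path.join(root, "a", "b", "needle.txt")
    open(target_path, "w").close()
    result = find_first(root, "needle.txt")
    print(result == target_path)
```

Because all_files() only produces a path when the loop asks for one, the break really does stop the disk traversal, rather than just discarding a list that was already built.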
DaveA