generator function

Alex Martelli aleax at aleax.it
Thu Aug 7 09:17:50 EDT 2003


chansky wrote:

> I read the following link about generator:
> http://www.python.org/peps/pep-0255.html
> but I am still not so clear on the use/purpose of a generator function
> other than the fact that a generator can retain the state of the local
> variables within the function body.  Can someone out there shed some
> light on this topic or share about how/when you would ever use a
> generator function.

You'll use a generator when you want to 'return', one after the other, a
sequence of results that are best computed sequentially.  The alternative
is to return the whole sequence of results at once, for example as a list,
but that may be an inferior choice if the number of results is large,
particularly if it's possible that the "consumer" of the results may
only be interested in a prefix of the whole sequence of results.
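(To make the "retains local state" point concrete first, here's a minimal
sketch -- the countdown function is just an illustration I'm making up,
not part of the chunking example below:)

```python
def countdown(n):
    # n is an ordinary local variable, but its value survives
    # between successive next() calls on the generator
    while n > 0:
        yield n
        n -= 1

for i in countdown(3):
    print(i)   # prints 3, then 2, then 1
```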

Suppose for example that you have a file that you know is open for binary
input but need not be seekable (e.g. it COULD be standard-input, a pipe,
etc etc); you know the file is made up of a number of chunks, with each
chunk being N bytes, and want to process the file sequentially by chunks.

I.e., the purpose is to be able to write:

# maybe after: thefile = open('whatever', 'rb')

for chunk in chunker(thefile, N):
    if process(chunk) == WE_ARE_DONE: break

Without generators, you could write chunker as a function returning a
list of chunks:

def chunker(afile, N):
    results = []
    while 1:
        chunk = afile.read(N)
        if not chunk: break    # end of file
        results.append(chunk)
    return results             # ALL chunks are read and stored first

However, the list built and returned by this version of chunker is
potentially huge -- that could be a terrible waste of memory, and
of time, particularly (but not exclusively) if it's at all likely 
that processing the first few chunks may already find one causing
a WE_ARE_DONE return value from the (hypothetical) 'process' function.

So, a generator gives you an elegant alternative (in Python 2.3 or
with "from __future__ import generators" in 2.2!):

def chunker(afile, N):
    while 1:
        chunk = afile.read(N)
        if not chunk: break    # end of file
        yield chunk            # hand back ONE chunk; resume here next time

voila -- now the chunks are returned one at a time, each one read only
when the consumer asks for it, and there is no waste of memory or time.
What could be neater...?
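(If you want to see the laziness in action, here's a quick self-contained
check -- io.BytesIO just stands in for a real binary file, and the
generator protocol is driven by hand with next() rather than a for loop:)

```python
import io

def chunker(afile, N):
    while 1:
        chunk = afile.read(N)
        if not chunk: break
        yield chunk

# io.BytesIO plays the role of an open binary file
thefile = io.BytesIO(b'abcdefghij')
chunks = chunker(thefile, 4)

print(next(chunks))   # b'abcd' -- only these 4 bytes read so far
print(next(chunks))   # b'efgh'
# if processing stopped here, the rest of the file is never read
```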


Alex

More information about the Python-list mailing list