need help on generator...

Alex Martelli aleaxit at yahoo.com
Sat Jan 22 04:10:36 EST 2005


Francis Girard <francis.girard at free.fr> wrote:
   ...
> But besides the fact that generators are either produced with the new "yield"
> reserved word or by defining the __new__ method in a class definition, I
> don't know much about them.

Having __new__ in a class definition has nothing much to do with
generators; it has to do with how the class is instantiated when you
call it.  Perhaps you mean 'next' (and __iter__)?  That makes instances
of the class iterators, just like iterators are what you get when you
call a generator.
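
To make the distinction concrete, here is a minimal sketch, assuming Python 3, where the method is spelled __next__ rather than 'next'. The names Countdown and countdown are made up for illustration. Both objects are iterators, but only the second comes from a generator.

```python
import types


class Countdown:
    """Hypothetical iterator class: yields start, start-1, ..., 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # An iterator returns itself from __iter__.
        return self

    def __next__(self):
        # Spelled 'next' in Python 2; '__next__' in Python 3.
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


def countdown(start):
    """Hypothetical generator function: calling it returns a generator."""
    while start > 0:
        yield start
        start -= 1


print(list(Countdown(3)))
print(list(countdown(3)))
# Only the object returned by the generator function is a generator.
print(isinstance(Countdown(3), types.GeneratorType))
print(isinstance(countdown(3), types.GeneratorType))
```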

> In particular, I don't know what Python constructs does generate a generator.

A 'def' of a function whose body uses 'yield', and in 2.4 the new genexp
(generator expression) construct.
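
A small sketch of the two constructs, assuming Python 3 (the function name squares is hypothetical). types.GeneratorType is the standard-library name for the type of generator objects.

```python
import types


def squares(n):
    """Hypothetical generator function: the 'yield' in the body is what matters."""
    for i in range(n):
        yield i * i


gen_from_def = squares(4)                   # calling the function builds a generator
gen_from_expr = (i * i for i in range(4))   # a generator expression builds one too

print(isinstance(gen_from_def, types.GeneratorType))
print(isinstance(gen_from_expr, types.GeneratorType))
print(list(gen_from_def) == list(gen_from_expr))
```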

> I know this is now the case for reading lines in a file or with the new
> "iterator" package.

Nope, besides the fact that the module you're thinking of is named
'itertools': itertools uses a lot of C-coded special types, which are
iterators but not generators.  Similarly, a file object is an iterator
but not a generator.
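
One way to check this yourself, sketched for Python 3 (the file name lines.txt and the variable names are invented for the example; collections.abc.Iterator tests for the iterator protocol, types.GeneratorType for actual generators):

```python
import collections.abc
import itertools
import os
import tempfile
import types

counter = itertools.count()
count_is_iterator = isinstance(counter, collections.abc.Iterator)
count_is_generator = isinstance(counter, types.GeneratorType)
print(count_is_iterator, count_is_generator)

# Write a small throwaway file so there is a real file object to inspect.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "lines.txt")
    with open(path, "w") as out:
        out.write("one\ntwo\n")
    with open(path) as afile:
        file_is_iterator = isinstance(afile, collections.abc.Iterator)
        file_is_generator = isinstance(afile, types.GeneratorType)
        first_line = next(afile)

print(file_is_iterator, file_is_generator, repr(first_line))
```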

> But what else ?

Since you appear to conflate generators and iterators, I guess the iter
built-in function is the main one you missed.  iter(x), for any x,
either raises an exception (if x's type is not iterable) or else returns
an iterator.
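
A quick sketch of both outcomes (try_iter is a hypothetical helper, not a built-in). The exception raised for a non-iterable argument is TypeError.

```python
def try_iter(x):
    """Return iter(x), or None if x's type is not iterable."""
    try:
        return iter(x)
    except TypeError:
        return None


list_iterator = try_iter([10, 20, 30])
print(next(list_iterator))      # first item of the list
print(try_iter(42) is None)     # an int is not iterable
```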

> Does Craig Ringer answer mean that list 
> comprehensions are lazy ?

Nope, those were generator expressions.
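
The difference is easy to observe with a function that records when it is called. This is a sketch for Python 3, and noisy and calls are names made up for the example.

```python
calls = []


def noisy(i):
    """Record each call so we can see when the work actually happens."""
    calls.append(i)
    return i * 2


# List comprehension: every item is computed right away.
eager = [noisy(i) for i in range(3)]
calls_after_listcomp = list(calls)

calls.clear()

# Generator expression: nothing is computed until an item is requested.
lazy = (noisy(i) for i in range(3))
calls_after_genexp = list(calls)
first = next(lazy)
calls_after_next = list(calls)

print(calls_after_listcomp, calls_after_genexp, calls_after_next)
```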

> Where can I find a comprehensive list of all the 
> lazy constructions built in Python ?

That's yet a different question -- at least one needs to add the
built-in xrange, which is neither an iterator nor a generator but IS
lazy (a historical artefact, admittedly).
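
For readers on Python 3, where xrange no longer exists under that name and the built-in range behaves the same lazy way, a rough sketch of "lazy but not an iterator" might look like this:

```python
import collections.abc
import sys

big = range(10 ** 12)   # no trillion-item list is ever built

is_iterator = isinstance(big, collections.abc.Iterator)
is_iterable = isinstance(big, collections.abc.Iterable)
print(is_iterator, is_iterable)

# The object stays tiny because items are computed on demand.
print(sys.getsizeof(big) < 1000)
print(big[5], big[10 ** 11])

# Being iterable, it hands out a fresh iterator each time iter() is called.
print(next(iter(big)), next(iter(big)))
```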

But fortunately Python's built-ins are not all THAT many, so that's
about it.

> (I think that to easily distinguish lazy 
> from strict constructs is an absolute programmer need -- otherwise you always
> end up wondering when is it that code is actually executed like in Haskell).

Encapsulation doesn't let you "easily distinguish" issues of
implementation.  For example, the fact that a file is an iterator (its
items being its lines) doesn't tell you if that's internally implemented
in a lazy or eager way -- it tells you that you can code afile.next() to
get the next line, or "for line in afile:" to loop over them, but does
not tell you whether the code for the file object is reading each line
just when you ask for it, or whether it reads all lines beforehand and
just keeps some state about the next one, or somewhere in between.

The answer for the current implementation, BTW, is "in between" -- some
buffering, but bounded consumption of memory -- but whether that tidbit
of pragmatics is part of the file specs, heh, that's anything but clear
(just as for other important tidbits of Python pragmatics, such as the
facts that list.sort is wickedly fast, 'x in alist' isn't, 'x in adict'
IS...).


Alex


