[Tutor] I am terribly confused about "generators" and "iterators".. Help me

Danny Yoo dyoo at hkn.eecs.berkeley.edu
Sat Oct 28 23:21:19 CEST 2006


> My understanding is that an iterator is basically a facade pattern. 
> If you aren't familiar with patterns, a facade pattern basically makes 
> something really easy to use or convenient.  Yes, you can "do it by 
> hand", and many times that is indeed the preferred method, but sometimes 
> it's easier to use an iterator...particularly if it isn't a special 
> case.  Nothing in the manual says you HAVE to use iterators, but they 
> *can* make life easier.


Concretely, we can write a function that will find all the even-length 
strings in some sequence, and give them back to us as a list:

################################
def filterEvenStrings(sequence):
    results = []
    for x in sequence:
        if len(x) % 2 == 0:
            results.append(x)
    return results
################################


Now, we can use this function on a list of strings, as one would expect:

###########################################################
>>> filterEvenStrings('hello world this is a test'.split())
['this', 'is', 'test']
###########################################################


But we can also use this on a file:

##########################################################################
>>> evenWords = filterEvenStrings(
...             [x.strip()
...              for x in open('/usr/share/dict/words').readlines()])
>>> evenWords[:10]
['aa', 'Aani', 'aardvark', 'aardwolf', 'Aaronite', 'Aaru', 'Ab', 'Ababua', 
'abac', 'abacay']
##########################################################################

Note the nice thing here: we've been able to reuse filterEvenStrings on 
two entirely different things!  We can use the same function on different 
things because those two things support a common interface: "iteration" 
support.
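
To make the "common interface" point a bit more concrete, here's a small 
sketch that feeds the very same function a list, a tuple, and a generator 
expression.  The sample words and the variable names are just made up for 
illustration, and the print calls use the Python 3 spelling:

```python
def filterEvenStrings(sequence):
    """Return a list of the even-length strings found in sequence."""
    results = []
    for x in sequence:
        if len(x) % 2 == 0:
            results.append(x)
    return results

# Made-up sample data, same words as the earlier example.
words = ['hello', 'world', 'this', 'is', 'a', 'test']

# Three different kinds of things, all of which support iteration.
from_list = filterEvenStrings(words)
from_tuple = filterEvenStrings(tuple(words))
from_generator = filterEvenStrings(w.upper() for w in words)

print(from_list)
print(from_tuple)
print(from_generator)
```

The function never asks what kind of thing it was handed.  All it does is 
loop over it, so anything a 'for' loop accepts will work.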


Most of Python's interesting data structures support iteration out of the 
box.  But there are plenty of things that don't by default, and for which 
we might want to add iteration support ourselves.

'for' loops expect things that iterate: if we try to apply them on things 
that don't, we'll get an error.  For example:

##############################################
>>> class Candy:
...     def __init__(self, name):
...         self.name = name
...
>>> class CandyBag:
...     def __init__(self):
...         self.contents = []
...     def addCandy(self, aCandy):
...         if isinstance(aCandy, Candy):
...             self.contents.append(aCandy)
...
###############################################

Ok, so we can make a CandyBag, and we might like to start adding things to 
it:

#######
>>> bag = CandyBag()
>>> bag.addCandy(42)
>>> bag.addCandy(Candy("snickers"))
>>> bag.addCandy(Candy("salt water taffy"))
>>> bag.addCandy("spam")
#######

We'd expect 42 and spam to be ignored, because they're not Candy.  Anyway, 
so let's go over the bag with a loop:

#######################################
>>> for c in bag:
...     print c
...
Traceback (most recent call last):
   File "<stdin>", line 1, in ?
TypeError: iteration over non-sequence
#######################################

This bag is not a sequence.  It may CONTAIN a sequence, but it itself 
isn't one.  Now, we could just pull out bag.contents if we're naughty, 
but we want to be good: let's have CandyBag support iteration.

########################################
class CandyBag:
    def __init__(self):
        """Creates an empty CandyBag."""
        self.contents = []

    def addCandy(self, aCandy):
        """Adds aCandy if it's a Candy.  Otherwise, ignore it."""
        if isinstance(aCandy, Candy):
            self.contents.append(aCandy)

    def __iter__(self):
        """Returns an iterator to all the candy in us."""
        return iter(self.contents)
########################################


We've added a method __iter__() that produces an iterator when requested. 
In this case, we'll reuse the built-in iterator for lists.  The iter() 
function takes anything that supports iteration, and gives us an iterator:

######
>>> myiterator = iter([3, 1, 4])
>>> myiterator
<listiterator object at 0x228abb0>
>>> myiterator.next()
3
>>> myiterator.next()
1
>>> myiterator.next()
4
>>> myiterator.next()
Traceback (most recent call last):
   File "<stdin>", line 1, in ?
StopIteration
######
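
That StopIteration at the end is how an iterator says "I'm out of stuff". 
As a rough sketch, this is approximately what a 'for' loop does for us 
behind the scenes.  The helper name manual_for is hypothetical, invented 
just for this illustration, and it uses the Python 3 spelling next(it) 
where the session above calls it.next():

```python
def manual_for(iterable, body):
    """Roughly what 'for x in iterable: body(x)' does under the hood."""
    iterator = iter(iterable)        # ask the object for an iterator
    while True:
        try:
            x = next(iterator)       # Python 3 spelling of iterator.next()
        except StopIteration:
            break                    # the iterator is exhausted: stop looping
        body(x)

seen = []
manual_for([3, 1, 4], seen.append)
print(seen)
```

So a 'for' loop only needs two things: iter() has to work on the object, 
and the iterator it gets back has to signal the end with StopIteration.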


And now that CandyBags support iteration, we can satisfy our sweet tooth 
and loop over a CandyBag using our for loop:

###########################################
>>> bag = CandyBag()
>>> bag.addCandy(42)
>>> bag.addCandy(Candy("snickers"))
>>> bag.addCandy(Candy("salt water taffy"))
>>> bag.addCandy("spam")
>>> for c in bag:
...     print c
...
<__main__.Candy instance at 0x22893f0>
<__main__.Candy instance at 0x22896c0>
###########################################

There are our two pieces of candy.  We didn't add a __str__() method to 
Candy, so that's why they're printing in that peculiar form.
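
If we want friendlier output, one possible fix is to give Candy a 
__str__() method.  This is only a sketch: the "Candy(...)" format is my 
own choice, not anything required, and it uses the Python 3 spelling of 
print:

```python
class Candy:
    def __init__(self, name):
        self.name = name

    def __str__(self):
        """Human-readable form, used by print and str()."""
        return "Candy(%s)" % self.name


class CandyBag:
    def __init__(self):
        """Creates an empty CandyBag."""
        self.contents = []

    def addCandy(self, aCandy):
        """Adds aCandy if it's a Candy.  Otherwise, ignore it."""
        if isinstance(aCandy, Candy):
            self.contents.append(aCandy)

    def __iter__(self):
        """Returns an iterator to all the candy in us."""
        return iter(self.contents)


bag = CandyBag()
bag.addCandy(42)                          # ignored: not a Candy
bag.addCandy(Candy("snickers"))
bag.addCandy(Candy("salt water taffy"))
bag.addCandy("spam")                      # ignored: not a Candy

printed = [str(c) for c in bag]           # iterates through __iter__
for line in printed:
    print(line)
```

The loop still goes through CandyBag's __iter__(), so nothing about the 
iteration support had to change to get nicer printing.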


