[Tutor] I am terribly confused about "generators" and "iterators".. Help me
Danny Yoo
dyoo at hkn.eecs.berkeley.edu
Sat Oct 28 23:21:19 CEST 2006
> My understanding is that an iterator is basically a facade pattern.
> If you aren't familiar with patterns, a facade pattern basically makes
> something really easy to use or convenient. Yes, you can "do it by
> hand", and many times that is indeed the preferred method, but sometimes
> it's easier to use an iterator...particularly if it isn't a special
> case. Nothing in the manual says you HAVE to use iterators, but they
> *can* make life easier.
Concretely, we can write a function that finds all the even-length
strings in some sequence and gives them back to us as a list:
################################
def filterEvenStrings(sequence):
    results = []
    for x in sequence:
        if len(x) % 2 == 0:
            results.append(x)
    return results
################################
Now, we can use this function on a list of strings, as one would expect:
###########################################################
>>> filterEvenStrings('hello world this is a test'.split())
['this', 'is', 'test']
###########################################################
But we can also use this on a file:
##########################################################################
>>> evenWords = filterEvenStrings(
...     [x.strip()
...      for x in open('/usr/share/dict/words').readlines()])
>>> evenWords[:10]
['aa', 'Aani', 'aardvark', 'aardwolf', 'Aaronite', 'Aaru', 'Ab', 'Ababua',
'abac', 'abacay']
##########################################################################
Note the nice thing here: we've been able to reuse filterEvenStrings on
two entirely different things! We can use the same function on different
things because those two things support a common interface: "iteration"
support.
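To make the "common interface" point a little more concrete, here is a small sketch that feeds the same function a tuple, a generator expression, and a dictionary. The function body is copied from above; the variable names (from_tuple and friends) are made up for illustration, and print is written with parentheses so the sketch also runs on Python 3.

```python
def filterEvenStrings(sequence):
    """Same function as above: keep the strings whose length is even."""
    results = []
    for x in sequence:
        if len(x) % 2 == 0:
            results.append(x)
    return results

words = 'hello world this is a test'.split()

# A tuple supports iteration, so the function works on it unchanged.
from_tuple = filterEvenStrings(tuple(words))

# A generator expression is not a list at all, but it can be looped over.
from_generator = filterEvenStrings(w.upper() for w in words)

# Iterating over a dictionary visits its keys.
from_dict = filterEvenStrings({'ab': 1, 'abc': 2, 'abcd': 3})

print(from_tuple)
print(from_generator)
print(sorted(from_dict))
```

None of these callers had to change filterEvenStrings; all the function asks of its argument is that a for loop can walk over it.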
Most of Python's interesting data structures have built-in iterators.
But plenty of things, such as classes we write ourselves, don't support
iteration by default, and we might want to add that support to them.
'for' loops expect things that iterate: if we try to apply them on things
that don't, we'll get an error. For example:
##############################################
>>> class Candy:
...     def __init__(self, name):
...         self.name = name
...
>>> class CandyBag:
...     def __init__(self):
...         self.contents = []
...     def addCandy(self, aCandy):
...         if isinstance(aCandy, Candy):
...             self.contents.append(aCandy)
...
###############################################
Ok, so we can make a CandyBag, and we might like to start adding things to
it:
#######
>>> bag = CandyBag()
>>> bag.addCandy(42)
>>> bag.addCandy(Candy("snickers"))
>>> bag.addCandy(Candy("salt water taffy"))
>>> bag.addCandy("spam")
#######
We'd expect 42 and "spam" to be ignored, because they're not Candy. Now
let's go over the bag with a loop:
#######################################
>>> for c in bag:
...     print c
...
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: iteration over non-sequence
#######################################
This bag is not a sequence. It may CONTAIN a sequence, but it itself
isn't one. Now, we could reach in and loop over bag.contents directly if
we're naughty, but we want to be good: let's have CandyBag support
iteration.
########################################
class CandyBag:
    def __init__(self):
        """Creates an empty CandyBag."""
        self.contents = []
    def addCandy(self, aCandy):
        """Adds aCandy if it's a Candy. Otherwise, ignore it."""
        if isinstance(aCandy, Candy):
            self.contents.append(aCandy)
    def __iter__(self):
        """Returns an iterator to all the candy in us."""
        return iter(self.contents)
########################################
We've added a method __iter__() that produces an iterator when requested.
In this case, we'll reuse the built-in iterator for lists. The iter()
function takes anything that supports iteration, and gives us an iterator:
######
>>> myiterator = iter([3, 1, 4])
>>> myiterator
<listiterator object at 0x228abb0>
>>> myiterator.next()
3
>>> myiterator.next()
1
>>> myiterator.next()
4
>>> myiterator.next()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
StopIteration
######
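The session above is, roughly, what a for loop does on our behalf. As a sketch (not the interpreter's actual implementation), the same steps can be spelled out by hand. The name loop_by_hand is invented for this example, and it uses the built-in next() function, which exists in Python 2.6 and later; in the older spelling shown above, that call would be iterator.next().

```python
def loop_by_hand(iterable):
    """Collect the items of iterable roughly the way a for loop visits them."""
    seen = []
    iterator = iter(iterable)          # ask the object for an iterator
    while True:
        try:
            item = next(iterator)      # fetch the next item, if there is one
        except StopIteration:          # the iterator has run out of items
            break
        seen.append(item)              # the body of the for loop would go here
    return seen

print(loop_by_hand([3, 1, 4]))
```

So StopIteration isn't really an error from the loop's point of view: it's just the signal that tells the loop to stop.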
And now that CandyBags support iteration, we can satisfy our sweet tooth
and loop over a CandyBag using our for loop:
###########################################
>>> bag = CandyBag()
>>> bag.addCandy(42)
>>> bag.addCandy(Candy("snickers"))
>>> bag.addCandy(Candy("salt water taffy"))
>>> bag.addCandy("spam")
>>> for c in bag:
...     print c
...
<__main__.Candy instance at 0x22893f0>
<__main__.Candy instance at 0x22896c0>
###########################################
There are our two pieces of candy. We didn't add str() support to our
Candy, which is why they print in that peculiar form.
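If the peculiar output bothers us, one possible fix is to give Candy a __str__() method. Since the original question was also about generators, the sketch below writes __iter__() as a generator, using yield, instead of borrowing the list's iterator. Both changes are illustrations rather than part of the classes above: the "Candy(...)" output format is made up, and print is written with parentheses so this also runs on Python 3.

```python
class Candy(object):
    def __init__(self, name):
        self.name = name
    def __str__(self):
        """Used by print and str() to get a readable form."""
        return "Candy(%s)" % self.name

class CandyBag(object):
    def __init__(self):
        """Creates an empty CandyBag."""
        self.contents = []
    def addCandy(self, aCandy):
        """Adds aCandy if it's a Candy. Otherwise, ignore it."""
        if isinstance(aCandy, Candy):
            self.contents.append(aCandy)
    def __iter__(self):
        """A generator: each yield hands one candy to whoever is looping."""
        for candy in self.contents:
            yield candy

bag = CandyBag()
bag.addCandy(42)
bag.addCandy(Candy("snickers"))
bag.addCandy(Candy("salt water taffy"))
bag.addCandy("spam")
for c in bag:
    print(c)
```

Calling a function that contains yield doesn't run its body right away. It returns a generator object, which is itself an iterator, so the for loop can use it exactly as it used the list iterator before.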