[Edu-sig] Some design patterns with iterators

Sun, 30 Mar 2003 23:16:22 -0800

At 07:44 PM 3/30/2003 -0500, Guido van Rossum wrote:
....

Your comments highly welcomed and greatly appreciated!

>--
>
>That's a contorted example.  Why not change this a bit so that instead
>of __call__ it has a next method that raises StopIteration when
>exhausted, and an __iter__ method returning self?

Agreed.  Students should realize that objects don't need to be wrapped by
iter() in order to perform as iterators.  It's sufficient to define a next()
method internally, and then assign the object itself as the return of __iter__
-- as you point out.

It's also possible to equate __call__ to next, so if the programmer *does*
happen to wrap the object inside iter(), even with a sentinel, it won't 
break --
although the sentinel is irrelevant in this pattern, because now the test
raises an exception instead.

class Fib2(object):
     def __init__(self,start=0,next=1,test=lambda x: False):
         # note revised default for test -- previous default of
         # 'False' was not callable
         self.start = start
         self.next  = next
         self.test  = test

     def next(self):
         self.start,self.next = self.next, self.start + self.next
         if self.test(self.start): raise StopIteration
         return self.start

     __call__ = next  # makes object callable

     def __iter__(self):
         return self

def testfunc(max):
     """a factory for test functions"""
     def t(x):
         return x > max
     t.__doc__ = "stop when x > %s" % max
     return t

t1 = testfunc(1000)
t2 = testfunc(5000)

def looper(iterable):
     for i in iterable:
         print i,
     print  # newline when done

def tests():
     print t1.__doc__
     looper(Fib2(test=t1))  # note:  no iter() wrapper -- Fib2 is already 
iterable
     print t2.__doc__
     looper(iter(Fib2(test=t2),-1))  # but we can wrap it without problems
     print t2.__doc__
     looper(iter(Fib2(test=t2)))     # with or without a sentinel

Having a sentinel where it's not needed is of course a sign of programmer 
confusion.
But one might argue the object itself is "more robust" this way.

> >   >>> def fibfunc(test):
> >          data = [1,1]
> >          while not test(data[0]):
> >             yield data[0]
> >             data[0],data[1] = data[1],data[0]+data[1]
> >          return
>
>Comment here: instead of using a two-element list (which could just as
>well be a tuple!), use two variables.  Faster and clearer IMO:
>
>   def f(test):
>    last, next = 1, 1
>    while not test(last):
>      yield last
>      last, next = next, last+next

Yes, clearer.  The list-based approach was left over from a version which
makes the initial two-value seed a parameter as in:

def iterseq(seed=[0,1],test=lambda x: False):
    # but even so, it's easier to read with two variables:
    last,next = seed
    while not test(last):
        yield last
        last, next = next, last+next

With this new freedom we can generate the Lucas Numbers, with seed
[2,1] instead of [1,1]:
http://www.mcs.surrey.ac.uk/Personal/R.Knott/Fibonacci/lucasNbs.html

def tests():
     #...
     print "Lucas Numbers"
     looper(iterseq([2,1],t1))

 >>> iters.tests()
...
Lucas Numbers
2 1 3 4 7 11 18 29 47 76 123 199 322 521 843

A modification lets us iterate until a last/next ratio is reached through
convergence -- need to be careful here, as infinite loops often result from
equating floating point numbers.

def iterseq2(seed=[1,1],test=lambda x,y: False):
    # variation, where test looks at both last and next
    last,next = seed
    while not test(last,next):
        yield "%s/%s" % (last,next)
        last, next = next, last+next

# probaby overkill to wrap a function here, but I'm wanting to *not* rely on a
# global value for phi, nor do I want to keep re-evaluating a square root with
# every test -- so I pass it in explicitly and build a function around it.

def testratio(terminus):
     def t(inta,intb):
          return (intb/inta) == terminus # uses new division
          # it'd be safer to test for abs(difference) < tolerance
     return t

t3 = testratio(0.5*(1+pow(5,.5)))  # phi = (1 + sqrt(5))/2

def tests():
     # ...
     print "Converging..."
     looper(iterseq2([1,1],t3))   # Fibonacci seed, convergence test

Converging...
1/1 1/2 2/3 3/5 5/8 8/13 13/21 21/34 34/55 55/89 89/144 144/233 233/377 
377/610 610/987 987/1597 1597/2584 2584/4181 4181/6765 6765/10946 
10946/17711 17711/28657 28657/46368 46368/75025 75025/121393 121393/196418 
196418/317811 317811/514229 514229/832040 832040/1346269 1346269/2178309 
2178309/3524578 3524578/5702887 5702887/9227465 9227465/14930352 
14930352/24157817 24157817/39088169 39088169/63245986 63245986/102334155

The Lucas Number next/last ratios likewise converge to phi.

>Are your students familiar with functions as first-order objects at
>this point?  It would be simpler if you could simply pass the limit as
>the argument to Fib() rather than having to construct a test function
>first.  Sure, a test function allows more freedom, but the API design
>here doesn't actually make it much easier to do something different in
>the test function -- e.g. another useful termination test would be to
>produce the first N numbers, but that would require the test function
>to be stateful, and the API doesn't promise that test() is only called
>once per iteration.

Yes, I'm sort of demonstrating two concepts -- iterators, and the fact that
functions may be passed as parameters.  There's more convolution as a result.

Plus the above example points up a weakness in this design:  the passed-in
function is relying on a hard-wired set of arguments, is in test(last),
whereas we might want access to both last and next for our test.  Our only
access to the local scope, within the generator function, is based on the
arguments to test() -- a limit on our freedom.

I do find it interesting that objects don't suffer as much from this 
limitation,
as one has access to all the object's instance variables through 'self' -- so
it's possible to write a top-level test function knowing that 'self' is the
only argument needed -- yet internal to the test function, you may access
self's scope at will:

class Fib3(object):
     def __init__(self,start=0,next=1,test=lambda x: False):
         self.start = start
         self.next  = next
         self.test = test

     def next(self):
         self.start,self.next = self.next, self.start + self.next
         if self.test(self):  # single parameter sufficient to access __dict__
              raise StopIteration
         return self.start

     __call__ = next

     def __iter__(self):
         return self

def testratio2(terminus):
     def t(obj):  # obj will be the self of the enclosing object
          return obj.next/obj.start == terminus # uses new division
     return t

But we're just getting more and more convoluted here (useful learning on
my end though) ...

Kirby