[Tutor] __iter__: one obvious way to do it

Steven D'Aprano steve at pearwood.info
Sun Mar 7 17:37:35 CET 2010


On Mon, 8 Mar 2010 02:07:41 am spir wrote:
> [sorry, forgot the code]
>
> Hello,
>
> Below 6 working way to implement __iter__ for a container here
> simulated with a plain inner list. Sure, the example is a bit
> artificial ;-)

> 1. __iter__ returns a generator _expression_
>     def __iter__(self):
>         return (pair(n) for n in self.items)

Seems perfectly acceptable to me. That's syntactic sugar for the next 
one:


> 2. __iter__ *is* a generator
>     def __iter__(self):
>         for n in self.items:
>             yield pair(n)
>         raise StopIteration

As Stefan pointed out, __iter__ can be a generator, so that's okay too. 
However, the StopIteration at the end is redundant: a generator raises 
StopIteration automatically when it "falls off the end" of its code. So 
this is equivalent to the above:

def __iter__(self):
    for n in self.items:
        yield pair(n)

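To check that versions 1 and 2 really do behave alike, here is a rough sketch. The original post doesn't show pair(), so the one below is a made-up stand-in, and the class names are invented for the example too.

```python
def pair(n):
    # Hypothetical stand-in for the helper used in the quoted code.
    return (n, n * n)

class WithGenExp:
    def __init__(self, items):
        self.items = list(items)
    def __iter__(self):
        # Version 1: return a generator expression.
        return (pair(n) for n in self.items)

class WithGenerator:
    def __init__(self, items):
        self.items = list(items)
    def __iter__(self):
        # Version 2: __iter__ is itself a generator.
        # No explicit StopIteration is needed at the end.
        for n in self.items:
            yield pair(n)

a = list(WithGenExp([1, 2, 3]))
b = list(WithGenerator([1, 2, 3]))
print(a == b)  # True

# Falling off the end is what signals exhaustion.
it = iter(WithGenerator([1]))
next(it)
try:
    next(it)
    exhausted = False
except StopIteration:
    exhausted = True
print(exhausted)  # True
```

On Python 3.7 and later, an explicit "raise StopIteration" inside a generator body is turned into a RuntimeError (PEP 479), so there it is better left out altogether.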

> 3. __iter__ returns a generator 
>    (this one is a bit weird, i guess)
>     def __iter__(self):
>         return self.pairs()

There's nothing weird about it. It's the difference between writing code 
directly inline, and calling it indirectly:

def f():
    return [1, 2, 3]

versus:

def indirect():
    return [1, 2, 3]

def f():
    return indirect()


If you have a good reason for the indirection, it is perfectly fine. 
E.g. you might have a class with multiple iterators:

class X:
    def iter_width(self):
        """Iterate left-to-right"""
        pass
    def iter_depth(self):
        """Iterate top-to-bottom"""
        pass
    def iter_spiral(self):
        pass
    def __iter__(self):  # Default.
        return self.iter_width()
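Filled in, that idea might look something like this. It is only a sketch: the Grid class and its nested-list storage are assumptions made up for the example, and iter_spiral is left out.

```python
class Grid:
    """Hypothetical 2-D container, used only to illustrate the idea."""
    def __init__(self, rows):
        self.rows = [list(r) for r in rows]
    def iter_width(self):
        """Iterate left-to-right, one row at a time."""
        for row in self.rows:
            for item in row:
                yield item
    def iter_depth(self):
        """Iterate top-to-bottom, one column at a time."""
        for column in zip(*self.rows):
            for item in column:
                yield item
    def __iter__(self):  # Default.
        return self.iter_width()

g = Grid([[1, 2], [3, 4]])
print(list(g))               # [1, 2, 3, 4]
print(list(g.iter_depth()))  # [1, 3, 2, 4]
```

A plain for-loop gets the default order, and callers who want another order ask for it by name.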


> 4. __iter__ returns self, its own iterator via next()
>     def __iter__(self):
>         self.i=0
>         return self

That's how you would make the class its own iterator: __iter__ returns 
self, and the class itself supplies the next() method.
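A minimal sketch of that pattern, with the missing method filled in. The class name and the index-based body are my guesses, since the original post only shows __iter__. Python 3 spells the method __next__; the alias keeps the Python 2 name working as well.

```python
class SelfIterating:
    """Hypothetical container that acts as its own iterator."""
    def __init__(self, items):
        self.items = list(items)
        self.i = 0
    def __iter__(self):
        self.i = 0          # restart, as in the quoted code
        return self
    def __next__(self):
        if self.i >= len(self.items):
            raise StopIteration
        value = self.items[self.i]
        self.i += 1
        return value
    next = __next__         # Python 2 name for the same method

c = SelfIterating([1, 2, 3])
print(list(c))  # [1, 2, 3]

# Caveat: there is only one counter, so nested loops interfere.
seen = []
for x in c:
    for y in c:
        seen.append((x, y))
print(seen)  # [(1, 1), (1, 2), (1, 3)]
```

The inner loop resets and then exhausts the shared counter, so the outer loop stops after its first item. That is the price of keeping the iteration state on the container itself.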


> 5. __iter__ returns an external iterator object
>     def __iter__(self):
>         return Iter(self)

Built-ins such as lists, dicts, tuples and sets use that strategy:

>>> iter([1,2,3])
<listiterator object at 0xb7d08a4c>
>>> iter(dict(a=1,b=2))
<dictionary-keyiterator object at 0xb7d08fe0>
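The same strategy for a home-made class could be sketched as below. The original post doesn't show the Iter class, so this body is an assumption, as is the Container class around it.

```python
class Iter:
    """Hypothetical external iterator; not shown in the original post."""
    def __init__(self, container):
        self.items = container.items
        self.i = 0          # each iterator keeps its own position
    def __iter__(self):
        return self         # iterators return themselves
    def __next__(self):
        if self.i >= len(self.items):
            raise StopIteration
        value = self.items[self.i]
        self.i += 1
        return value
    next = __next__         # Python 2 name for the same method

class Container:
    def __init__(self, items):
        self.items = list(items)
    def __iter__(self):
        return Iter(self)   # a fresh iterator on every call

c = Container([1, 2])
pairs = [(x, y) for x in c for y in c]
print(pairs)  # [(1, 1), (1, 2), (2, 1), (2, 2)]
```

Because every call to __iter__ hands out a new Iter with its own position, several loops over the same container can run at once without disturbing each other.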



> 6. __iter__ returns iter() of a collection built just on time
>    (this one is really contrived)
>     def __iter__(self):
>         return iter(tuple([pair(n) for n in self.items]))

Not contrived, but inefficient.

First you build a list, all at once, using a list comprehension. So much 
for lazy iteration, but sometimes you have good reason for this (see 
below).

Then you copy everything in the list into a tuple. Why?

Then you create an iterator from the tuple.

If you remove the intermediate tuple, it is a reasonable approach for 
ensuring that you can modify the original object without changing any 
iterators made from it. In other words, __iter__ returns a *copy* of 
the data in self. The easiest way to do that is:

def __iter__(self):
    return iter([pair(n) for n in self.items])

No need to make a tuple first.
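To see the difference the copy makes, compare it with the lazy version. This is a sketch only: pair() is a made-up stand-in for the helper in the quoted code, and both class names are invented.

```python
def pair(n):
    # Hypothetical stand-in for the helper used in the quoted code.
    return (n, n * n)

class Snapshot:
    def __init__(self, items):
        self.items = list(items)
    def __iter__(self):
        # Build the whole list now; later changes don't affect it.
        return iter([pair(n) for n in self.items])

class Lazy:
    def __init__(self, items):
        self.items = list(items)
    def __iter__(self):
        # Items are read from self.items only as they are requested.
        return (pair(n) for n in self.items)

s = Snapshot([1, 2, 3])
it = iter(s)
s.items.append(4)
snapshot_result = list(it)
print(snapshot_result)  # [(1, 1), (2, 4), (3, 9)]

z = Lazy([1, 2, 3])
it = iter(z)
z.items.append(4)
lazy_result = list(it)
print(lazy_result)      # [(1, 1), (2, 4), (3, 9), (4, 16)]
```

The snapshot iterator never sees the appended item, while the lazy one does. The cost of the snapshot is building the full list up front.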



> Also, one can always traverse the collection (already existing or
> built then) itself if it not quasi-infinite (no __iter__ at all).

The point of __iter__ is to have a standard way to traverse data 
structures, so you can traverse them with for-loops. Otherwise, every 
data structure needs a different method:

for item in tree.traverse(): ...

for item in mapping.next_key(): ...

for item in sequence.get_next_item(): ...

for item in obj.walker(): ...
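With __iter__ in place, the caller doesn't need to know the method's name at all. A small sketch, using a made-up Tree class whose traverse() method is an assumption for the example:

```python
class Tree:
    """Hypothetical container with its own traversal method."""
    def __init__(self, value, children=()):
        self.value = value
        self.children = list(children)
    def traverse(self):
        # Visit this node first, then each subtree in order.
        yield self.value
        for child in self.children:
            for v in child.traverse():
                yield v
    def __iter__(self):
        # Expose the traversal through the standard protocol.
        return self.traverse()

tree = Tree(1, [Tree(2), Tree(3, [Tree(4)])])
for item in tree:
    print(item)
print(list(tree))  # [1, 2, 3, 4]
print(sum(tree))   # 10
print(3 in tree)   # True
```

The for-loop, list(), sum() and the "in" operator all work without any of them knowing about traverse().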



> "There should be one-- and preferably only one --obvious way to do
> it" http://www.python.org/dev/peps/pep-0020/

This doesn't mean that there should be *only* one way to do something. 
It means that there should be one OBVIOUS way to do it.



-- 
Steven D'Aprano

