[Tutor] iterators
Peter Otten
__peter__ at web.de
Sat Jan 18 10:50:21 CET 2014
Keith Winston wrote:
> I don't really get iterators. I saw an interesting example on
> Stackoverflow, something like
>
> with open('workfile', 'r') as f:
>     for a, b, c in zip(f, f, f):
>         ....
>
> And this iterated through a, b, c assigned to 3 consecutive lines of
> the file as it iterates through the file. I can sort of pretend that
> makes sense, but then I realize that other things that I thought were
> iterators aren't (lists and the range function)... I finally succeeded
> in mocking this up with a generator:
>
> gen = (i for i in range(20))
> for t1, t2, t3 in zip(gen, gen, gen):
>     print(t1, t2, t3)
>
> So I'm a little more confident of this... though I guess there's some
> subtlety of how zip works there that's sort of interesting. Anyway,
> the real question is, where (why?) else do I encounter iterators,
> since my two favorite examples, aren't... and why aren't they, if I
> can iterate over them (can't I? Isn't that what I'm doing with "for
> item in list" or "for index in range(10)")?
You can get an iterator from a list or range object by calling iter():
>>> iter([1, 2, 3])
<list_iterator object at 0xeb0510>
>>> iter(range(10))
<range_iterator object at 0xeaccf0>
Every time you call iter() on a sequence you get a new iterator, but when you
call iter() on an iterator, the iterator should return itself.
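To make the difference concrete, here is a small sketch of both rules. The variable names are only for illustration:

```python
numbers = [1, 2, 3]

# A list is an iterable, not an iterator: each iter() call builds a new iterator.
first = iter(numbers)
second = iter(numbers)
distinct = first is not second           # True

# Each iterator keeps track of its own position.
next(first)                              # consumes 1 from `first` only
positions = (next(first), next(second))  # (2, 1)

# Calling iter() on an iterator hands back the very same object.
same = iter(first) is first              # True

# next() needs an iterator; the list itself does not qualify.
try:
    next(numbers)
    list_is_iterator = True
except TypeError:
    list_is_iterator = False
```

This is why lists and range objects can be looped over many times, while an iterator is exhausted after one pass.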
A for loop calls iter() implicitly on the object it loops over. So you can
think of the for-loop
    for item in range(10):
        print(item)
as syntactic sugar for
    tmp = iter(range(10))
    while True:
        try:
            item = next(tmp)
        except StopIteration:
            break
        print(item)
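As for where else you meet iterators: any object with an `__iter__()` method that returns itself and a `__next__()` method that eventually raises StopIteration is one. Below is a minimal sketch. The `Countdown` class and the `manual_for()` helper are made-up names for illustration, and `manual_for()` simply wraps the expanded loop in a function:

```python
class Countdown:
    """Hypothetical iterator that counts down from `start` to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # An iterator returns itself from __iter__().
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


def manual_for(iterable):
    """Collect the items the way the expanded for-loop would visit them."""
    result = []
    tmp = iter(iterable)
    while True:
        try:
            item = next(tmp)
        except StopIteration:
            break
        result.append(item)
    return result


via_for = [item for item in Countdown(3)]  # [3, 2, 1]
via_while = manual_for(Countdown(3))       # [3, 2, 1]
```

Both spellings visit the same items, whether the argument is an iterator like `Countdown(3)` or a plain iterable like `range(3)`.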
Back to your zip() example. Here is a possible implementation of zip():
>>> def myzip(*iterables):
...     iterators = [iter(it) for it in iterables]
...     while True:
...         t = tuple(next(it) for it in iterators)
...         if len(t) < len(iterators):
...             break
...         yield t
...
When you pass it a range object twice, there will be two distinct iterators
in the `iterators` list that are iterated over in parallel:
>>> rfive = range(5)
>>> list(myzip(rfive, rfive))
[(0, 0), (1, 1), (2, 2), (3, 3), (4, 4)]
but when you pass the same iterator twice, because
    iter(some_iter) is some_iter
holds (i.e. the iter() function is "idempotent"),
    next(iterators[0])
and
    next(iterators[1])
operate on the same iterator and you get the behaviour seen on
Stack Overflow:
>>> ifive = iter(range(5))
>>> list(myzip(ifive, ifive))
[(0, 1), (2, 3)]
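The file example from the question works the same way, because a file object is its own iterator. Here is a sketch using the built-in zip(), with io.StringIO standing in for the open file. The file contents and the list are invented for illustration:

```python
import io

# io.StringIO stands in for open('workfile', 'r'); the contents are made up.
f = io.StringIO("a1\nb1\nc1\na2\nb2\nc2\n")

# A file-like object is its own iterator, so all three zip() arguments
# advance the same stream and each tuple holds three consecutive lines.
file_is_own_iterator = iter(f) is f
triples = [(a.strip(), b.strip(), c.strip()) for a, b, c in zip(f, f, f)]

# A list gets three independent iterators, so each item is just repeated.
lines = ["a1", "b1", "c1", "a2", "b2", "c2"]
from_list = list(zip(lines, lines, lines))
```

Here `triples` comes out as `[('a1', 'b1', 'c1'), ('a2', 'b2', 'c2')]`, whereas `from_list` starts with `('a1', 'a1', 'a1')` and has six entries.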
PS: There is an odd difference in the behaviour of list-comps and generator
expressions. The latter swallow StopIteration exceptions, which is why the
above myzip() needs the len() test:
>>> iterators = [iter(range(3))] * 10
>>> [next(it) for it in iterators]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in <listcomp>
StopIteration
>>> iterators = [iter(range(3))] * 10
>>> tuple(next(it) for it in iterators)
(0, 1, 2)
With a list-comp myzip could be simplified:
>>> def myzip(*iterables):
...     iterators = [iter(it) for it in iterables]
...     while True:
...         t = [next(it) for it in iterators]
...         yield tuple(t)
...
>>> list(myzip(*[iter(range(10))]*3))
[(0, 1, 2), (3, 4, 5), (6, 7, 8)]
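One caveat for newer interpreters: since Python 3.7 (PEP 479), a StopIteration that escapes from inside a generator is turned into a RuntimeError, so both myzip() variants above rely on behaviour that no longer holds there. The following is a sketch of a variant that catches the exception explicitly and should behave the same on any Python 3 version. The name `myzip_explicit` is made up for illustration:

```python
def myzip_explicit(*iterables):
    """Hypothetical zip() look-alike that stops at the shortest input."""
    iterators = [iter(it) for it in iterables]
    if not iterators:
        return  # like zip(), yield nothing when called without arguments
    while True:
        t = []
        for it in iterators:
            try:
                t.append(next(it))
            except StopIteration:
                # End the generator with a plain return instead of letting
                # StopIteration escape (a RuntimeError since Python 3.7).
                return
        yield tuple(t)


rfive = range(5)
pairs = list(myzip_explicit(rfive, rfive))

ifive = iter(range(5))
shared = list(myzip_explicit(ifive, ifive))

grouped = list(myzip_explicit(*[iter(range(10))] * 3))
```

With these inputs `pairs`, `shared` and `grouped` match the three results shown earlier in this message.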