
On Mon, Nov 29, 2021 at 12:11:43AM -0500, David Mertz, Ph.D. wrote:
On Sun, Nov 28, 2021, 11:43 PM Paul Bryan <pbryan@anode.ca> wrote:
According to https://docs.python.org/3/glossary.html#term-iterator and https://docs.python.org/3/library/stdtypes.html#typeiter, iterators must implement the __iter__ method.
From your first link:
CPython implementation detail: CPython does not consistently apply the requirement that an iterator define __iter__().
That comment is newly added to the documentation; it wasn't there in 3.9:

https://docs.python.org/3.9/glossary.html#term-iterator

and I don't think it should have been added. Rather than muddying the waters with a comment that CPython doesn't obey its own rules, I would rather we fixed the broken iterators so that they weren't broken. For Python classes, all it needs is a one-line method. For C classes, I presume it's a bit more complex, but not that much.

Does anyone know which builtin or stdlib iterators fail to implement `__iter__`? I haven't been able to find any -- all the obvious examples do (map, filter, reversed, zip, generators, list iterators, set iterators, etc).
    >>> obj = iter(zip('', ''))
    >>> obj is iter(obj)
    True
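The same identity check can be run over the other examples listed above in one go. This is only a sketch: the `candidates` dict and its labels are made up for illustration, and it covers a handful of builtins, not the whole stdlib.

```python
# For a well-behaved iterator, iter(obj) must return the very same object.
# `candidates` is a hypothetical name; the entries mirror the examples
# mentioned in the text (map, filter, reversed, zip, generators, ...).
candidates = {
    "map": map(str, []),
    "filter": filter(None, []),
    "reversed": reversed([]),
    "zip": zip("", ""),
    "generator": (x for x in ()),
    "list iterator": iter([]),
    "set iterator": iter(set()),
}

# True means the object returns itself from __iter__.
results = {name: iter(obj) is obj for name, obj in candidates.items()}

for name, ok in results.items():
    print(f"{name}: {'ok' if ok else 'BROKEN'}")
```

Anything reported as BROKEN would be a candidate for the kind of fix suggested above.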
The inconvenient truth here is that if you have an object that defines only `__next__`, you **cannot** iterate over it directly!
    >>> class BrokenIterator(object):
    ...     def __next__(self):
    ...         return 1
    ...
    >>> for i in BrokenIterator():
    ...     print(i)
    ...
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: 'BrokenIterator' object is not iterable
How is that an iterator when it doesn't support iteration?

You can, sometimes, get away with such a broken iterator if you iterate over it *indirectly*, that is, you have a second class with an `__iter__` method which returns a BrokenIterator instance:

    class Container(object):
        def __iter__(self):
            return BrokenIterator()

Now you can iterate over a Container instance:

    for i in Container():
        print(i)
        break

but the moment that somebody tries to use that as an actual iterator, for example by skipping the first element:

    it = iter(Container())
    next(it, None)  # discard the first element
    for i in it:
        print(i)

it will blow up in their face.

Would-be iterators that supply only `__next__` are broken. If you go back to the original PEP that introduced iterators in Python 2.1, it is clear:

    "Classes can define how they are iterated over by defining an
    __iter__() method; this should take no additional arguments and
    return a valid iterator object. A class that wants to be an
    iterator should implement two methods: a next() method that
    behaves as described above, and an __iter__() method that returns
    self."

https://www.python.org/dev/peps/pep-0234/

The reference manual is correct. I quote:

    "One method needs to be defined for container objects to provide
    iterable support: ..."

(that would be `__iter__`).

    "The iterator objects themselves are required to support the
    following two methods, which together form the iterator
    protocol: ..."

(and they would be `__iter__` returning self, and `__next__`).

https://docs.python.org/3/library/stdtypes.html#iterator-types
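For comparison, here is a rough sketch of what the repaired class might look like. `FixedIterator` and its `limit` parameter are hypothetical names, and the limit is added purely so that the loop terminates (the BrokenIterator above returns 1 forever). The only change that matters is the one-line `__iter__` returning self.

```python
class FixedIterator:
    """Hypothetical repaired version of BrokenIterator."""

    def __init__(self, limit=3):
        # `limit` is an assumption for illustration, so iteration stops.
        self._count = 0
        self._limit = limit

    def __iter__(self):
        # The one-line fix: an iterator returns itself.
        return self

    def __next__(self):
        if self._count >= self._limit:
            raise StopIteration
        self._count += 1
        return self._count


class Container:
    def __iter__(self):
        return FixedIterator()


it = iter(Container())
next(it, None)           # discard the first element
rest = [i for i in it]   # iterating the iterator directly now works
print(rest)
```

With `__iter__` in place, the skip-the-first-element pattern no longer raises TypeError.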
That said, I don't think the description at the link is very good. Anyway, it's different from what I teach,
Then you are teaching it wrong. Sorry.
and also different from how Python actually behaves. E.g.:
    >>> class Foo:
    ...     def __iter__(self):
    ...         return Bar()
Then Foo instances are *iterable* but they are not iterators.
...
    >>> class Bar:
    ...     def __next__(self):
    ...         if random() > 0.5:
    ...             raise StopIteration
    ...         return "Bar"
And Bar instances are broken iterators. [...]
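One way to see the distinction, assuming the ABCs in `collections.abc` are taken as the yardstick (they only check which methods a class defines, nothing more). The variable names below are made up for illustration; Foo and Bar are repeated so the snippet runs on its own.

```python
from collections.abc import Iterable, Iterator
from random import random


class Foo:
    def __iter__(self):
        return Bar()


class Bar:
    def __next__(self):
        if random() > 0.5:
            raise StopIteration
        return "Bar"


# Foo defines __iter__ only: iterable, but not an iterator.
foo_is_iterable = isinstance(Foo(), Iterable)
foo_is_iterator = isinstance(Foo(), Iterator)

# Bar defines __next__ only: it passes neither check.
bar_is_iterable = isinstance(Bar(), Iterable)
bar_is_iterator = isinstance(Bar(), Iterator)

print(foo_is_iterable, foo_is_iterator, bar_is_iterable, bar_is_iterator)
```

The Iterator ABC requires both `__iter__` and `__next__`, so a class with only `__next__` is not recognised as one.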
Or anyway, what would you call `bar := Bar()` if not "an iterator"?!
A broken iterator which cannot be iterated over directly.

--
Steve