On Tue, Sep 14, 2021 at 11:44 PM Steven D'Aprano <steve@pearwood.info> wrote:
On Tue, Sep 14, 2021 at 09:38:38PM -0700, Guido van Rossum wrote:

> > I don't know what I would call an object that only has __next__,
> > apart from "broken" :-(
> >
>
> It's still an iterator, since it duck-types in most cases where an iterator
> is required (notably "for", which is the primary use case for the iteration
> protocols -- it's in the first sentence of PEP 234's abstract).

I don't think it duck-types as an iterator. Here's an example:


class A:
    def __init__(self): self.items = [1, 2, 3]
    def __next__(self):
        try: return self.items.pop()
        except IndexError: raise StopIteration


class B:
    def __iter__(self):
        return A()


It's fine to iterate over B() directly, but you can't iterate over
A() at all. If you try, you get a TypeError:

    >>> for item in A():  pass
    ...
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: 'A' object is not iterable

Yes, we all understand that. The reason I invoked "duck typing" is that as long as you don't use the iterator in a situation where iter() is called on it, it works fine. Just like a class with a readline() method works fine in some cases where a file is expected.
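
To make the duck-typing point concrete, here is a minimal sketch that reuses the A and B classes from the example above. The names collected, first and error are only illustrative and are not part of the original example.

```python
class A:
    """Has __next__ but deliberately no __iter__."""
    def __init__(self):
        self.items = [1, 2, 3]

    def __next__(self):
        try:
            return self.items.pop()
        except IndexError:
            raise StopIteration


class B:
    def __iter__(self):
        return A()


# Iterating over B() works: iter(B()) only needs the returned
# object to provide __next__.
collected = [item for item in B()]

# Calling next() directly never involves iter(), so A works here too.
first = next(A())

# Only code that calls iter() on the A instance itself fails.
try:
    iter(A())
except TypeError as exc:
    error = str(exc)
```

The failure shows up only in the last case, which is the situation the duck-typing argument excludes.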
 
In practice, this impacts some very common techniques. For instance,
pre-calling iter() on your input.


    >>> x = B()
    >>> it = iter(x)
    >>> for value in it:  pass
    ...
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: 'A' object is not iterable


There are all sorts of reasons why one might pre-call iter(). One common
one is to pre-process the first element:

    it = iter(obj)
    first = next(it, None)
    for item in it: ...
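
A runnable sketch of the "pre-process the first element" pattern, assuming a hypothetical helper named first_and_rest. Note that next() is called on the iterator returned by iter(), not on the original object.

```python
def first_and_rest(obj):
    """Split an iterable into its first element and the remaining ones.

    Hypothetical helper for illustration only.
    """
    it = iter(obj)
    # Consume the first element; None is the default for empty input.
    first = next(it, None)
    # list() calls iter(it), so `it` must itself support iter().
    rest = list(it)
    return first, rest


result = first_and_rest([10, 20, 30])
empty = first_and_rest([])
```

The second step is where an object that has only __next__ would fail, because list() calls iter() on it again.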

Another is to test whether an object is iterable: iter(obj) raises
TypeError if obj is not a sequence, collection, iterator, or other iterable.
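
A minimal sketch of that test, assuming a hypothetical helper named is_iterable and a hypothetical class OnlyNext that defines __next__ but not __iter__:

```python
def is_iterable(obj):
    """Return True if iter(obj) succeeds (hypothetical helper)."""
    try:
        iter(obj)
    except TypeError:
        return False
    return True


class OnlyNext:
    """Defines __next__ but not __iter__ (hypothetical example class)."""
    def __next__(self):
        raise StopIteration


checks = (is_iterable([1, 2]), is_iterable(42), is_iterable(OnlyNext()))
```

Under this test, an object with only __next__ is reported as not iterable, just like the integer.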

Another is to break out of one loop and then run another:

    it = iter(obj)
    for x in it:
        if condition: break
        do_something()

    for x in it:
        something_else()
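
A runnable sketch of the two-loop pattern, assuming a hypothetical function split_at_blank that separates header lines from body lines at the first empty string:

```python
def split_at_blank(lines):
    """Split lines into (header, body) at the first empty string.

    Hypothetical helper for illustration only.
    """
    it = iter(lines)
    header = []
    for line in it:
        if line == "":
            break
        header.append(line)

    # The second loop resumes where the first one stopped, because
    # iter(it) returns the same, partially consumed iterator.
    body = []
    for line in it:
        body.append(line)
    return header, body


header, body = split_at_blank(["a", "b", "", "c", "d"])
```

Both loops call iter() on the same object, so this only works if that object's __iter__ returns itself.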


I'm sure there are others I haven't thought of.

No-one is arguing that an iterator that doesn't define __iter__ is great. And the docs should continue to strongly recommend adding an __iter__ method that returns self.
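
For reference, a minimal sketch of the recommended form, using a hypothetical class named Countdown that defines both methods:

```python
class Countdown:
    """Iterator that defines both __iter__ and __next__ (hypothetical)."""
    def __init__(self, items):
        self.items = list(items)

    def __iter__(self):
        # Returning self lets iter() be called on the iterator again.
        return self

    def __next__(self):
        try:
            return self.items.pop()
        except IndexError:
            raise StopIteration from None


it = iter(Countdown([1, 2, 3]))
same = iter(it) is it
first = next(it)
rest = list(it)
```

With __iter__ in place, pre-calling iter() and then looping works as expected.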

My only beef is with over-zealous people who might want to preemptively reject, at runtime, an iterator that only has __next__. In particular, "for" and iter() have no business checking for this attribute: "for" only needs __next__, and iter() should only check for the minimal version of the protocol, rejecting things without __next__.
 
I believe that iterable objects that define `__next__` but not
`__iter__` are fundamentally broken. If they happen to work in some
circumstances but not others, that's because the iterator protocol is
relaxed enough to work with broken iterators :-)

Your opinion is loud and clear. I just happen to disagree.

--
--Guido van Rossum (python.org/~guido)