On Tue, 12 Oct 2021 at 11:48, Chris Angelico firstname.lastname@example.org wrote:
On Tue, Oct 12, 2021 at 8:43 PM Oscar Benjamin email@example.com wrote:
A leaky StopIteration can wreak all sorts of havoc. There was a PEP (PEP 479) that attempted to solve this by turning StopIteration into RuntimeError if it is raised inside a generator body, but that PEP (which was rushed through very quickly IIRC) missed the fact that generators are not the only iterators. It remains a problem that leaking a StopIteration into map, filter, etc. will terminate iteration of an outer loop.
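For anyone who has not seen it happen, here is a minimal sketch of the map case. The helper name first_item and the sample data are made up for illustration.

```python
def first_item(seq):
    # Hypothetical helper: bare next() raises StopIteration on empty input.
    return next(iter(seq))

rows = [[1, 2], [], [3, 4]]

# map() does not intercept a StopIteration raised by the function it calls,
# so the exception escapes from map's own __next__ and is indistinguishable
# from normal exhaustion. list() just stops consuming.
result = list(map(first_item, rows))
print(result)
```

No exception is reported and the [3, 4] row is never processed.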
Generators are special because they never mention StopIteration. They are written like functions, but behave like iterators. That is why StopIteration leaking is such a problem.
Generators are a common case and are important, so the PEP definitely helps. It is incomplete, though, because the problem remains for other cases. StopIteration is rarely mentioned anywhere, e.g. there is nothing about it in the docstring for map: https://docs.python.org/3/library/functions.html#map
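As a rough illustration of what the PEP changed for generators, here is a sketch assuming Python 3.7 or later, where the behaviour is on by default. The generator name firsts is hypothetical.

```python
def firsts(rows):
    for row in rows:
        # Bare next() inside a generator body. On an empty row the
        # StopIteration is converted to RuntimeError by the interpreter.
        yield next(iter(row))

caught = None
try:
    list(firsts([[1, 2], [], [3, 4]]))
except RuntimeError as exc:
    caught = exc

print(type(caught).__name__)
```

Here the failure is loud instead of silent, which is the part the PEP does fix.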
In every other situation, StopIteration is part of the API of what you're working with. It is a bug to call next() without checking for StopIteration (or knowingly and intentionally permitting it to bubble).
Exactly: simple usage of next is often a bug. We need to be careful about this every time someone suggests that it's straightforward to do next(iter(obj)).
The culprit for the problem of leaking StopIteration is next itself, which in the 1-arg form is only really suitable for use when implementing an iterator and not for the much more common case of simply wanting to extract something from an iterable. Numerous threads here, on Stack Overflow and elsewhere suggesting that you can simply use next(iter(obj)) are encouraging bug-magnet code. Worse, the bug when it arises will easily manifest as something like silent data loss and can be hard to debug.
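To make the 1-arg versus 2-arg distinction concrete, here is a small sketch. The wrapper name first_or_default is hypothetical; the only real API used is the builtin next(iterator, default).

```python
def first_or_default(iterable, default=None):
    # The 2-arg form of next() returns `default` instead of raising
    # StopIteration when the iterator is already exhausted.
    return next(iter(iterable), default)

empty_result = first_or_default([])
nonempty_result = first_or_default([10, 20])

# The 1-arg form raises StopIteration on the same empty input.
try:
    next(iter([]))
except StopIteration:
    bare_raised = True
else:
    bare_raised = False

print(empty_result, nonempty_result, bare_raised)
```

The 2-arg form cannot leak StopIteration, at the cost of needing a default that cannot be confused with a real first element.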
That's no worse than getattr() and AttributeError. If you call getattr and you aren't checking for AttributeError, then you could be running into the exact same sorts of problems, because AttributeError is part of the function's API.
The difference is that you usually don't try to catch AttributeError in a higher-up frame. A function that leaks StopIteration is not iterator-safe and cannot be used with functional iterator tools like map. The exact reason for the danger of bare next is not obvious even to experienced Python programmers. Before the discussions around the PEP I had pointed it out several times and saw experienced commenters on lists like this being confused about what exactly the problem was. Maybe I'm not good at explaining myself, but if the problem were obvious then it wouldn't have needed careful explanation.
The real advantage of providing first (or "take" or any of the other names that have been proposed in the past) is that it should raise a different exception like ValueError so that it would be safe to use by default.
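A sketch of what such a function might look like. This is only one possible implementation: the name first and the choice of ValueError come from the proposal, while the body and the error message are assumptions.

```python
def first(iterable):
    """Hypothetical first(): return the first item or raise ValueError."""
    for item in iterable:
        return item
    # Reached only when the loop body never ran, i.e. the iterable was empty.
    raise ValueError("first() called on an empty iterable")

rows = [[1, 2], [], [3, 4]]

try:
    result = list(map(first, rows))
except ValueError as exc:
    # Unlike StopIteration, ValueError is not swallowed by the iteration
    # protocol, so it propagates out of map() and list().
    result = f"ValueError: {exc}"

print(result)
```

The caller still has to deal with the empty case, but forgetting to do so produces a traceback rather than a truncated result.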
ValueError is no safer. The first() function would have, as its API, "returns the first element or raises ValueError if there is none". So now the caller of first() has to use try/except to handle the case where there is no value. Failing to do so is *just as buggy* as leaking a StopIteration.
A leaky StopIteration is a majorly confusing bug inside a __next__ function, because StopIteration is part of that function's API.
On the contrary: a __next__ function is the only place where it could possibly be valid to raise StopIteration. The fact that next raises StopIteration which passes through to the caller can be useful in this situation and this situation alone: https://github.com/python/cpython/blob/b37dc9b3bc9575adc039c6093c643b7ae5e91...
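For illustration only (this is not the code behind the link above), here is a sketch of a __next__ that relies on the pass-through. The class name Doubled is made up.

```python
class Doubled:
    """Hypothetical iterator wrapper that yields each item times two."""

    def __init__(self, iterable):
        self._it = iter(iterable)

    def __iter__(self):
        return self

    def __next__(self):
        # StopIteration from the inner next() is deliberately allowed to
        # propagate: it is exactly the signal __next__ must give when the
        # underlying iterator is exhausted.
        return next(self._it) * 2

doubled = list(Doubled([1, 2, 3]))
print(doubled)
```

Here the bare next() is correct because raising StopIteration is the documented way for __next__ to say it is finished.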
In any other situation it would be better to call first() and have something like ValueError instead.