On Tue, Oct 12, 2021 at 10:24 PM Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
On Tue, 12 Oct 2021 at 11:48, Chris Angelico <rosuav@gmail.com> wrote:
On Tue, Oct 12, 2021 at 8:43 PM Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
A leaky StopIteration can wreak all sorts of havoc. PEP 479 attempted to solve this by turning StopIteration into RuntimeError if it escapes from a generator body, but that PEP (which was rushed through very quickly IIRC) missed the fact that generators are not the only iterators. It remains a problem that leaking a StopIteration into map, filter, etc. will terminate iteration of an outer loop.
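A small sketch of the truncation described above. The helper name first_item and the sample data are made up for illustration; the behaviour shown is that of the built-in map in CPython.

```python
def first_item(seq):
    # Bare one-argument next(): raises StopIteration if seq is empty.
    return next(iter(seq))

rows = [[1, 2], [], [3, 4]]

# The StopIteration raised for the empty row escapes through map.__next__,
# so list() takes it to mean "map is exhausted" and stops without an error.
result = list(map(first_item, rows))
print(result)  # [1], the third row is silently dropped
```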
Generators are special because they never mention StopIteration. They are written like functions, but behave like iterators. That is why StopIteration leaking is such a problem.
Generators are a common case and are important, so the PEP definitely helps. It is incomplete, though, because the problem remains for other cases. StopIteration is rarely mentioned anywhere; e.g. there is nothing about it in the documentation for map: https://docs.python.org/3/library/functions.html#map
If you want to report it as a bug in map(), feel free to do so. It's not a general issue to be solved. I would say that this version of map() is naive, and that version is safe:

    class map_naive:
        def __init__(self, func, it):
            self.func = func; self.it = iter(it)
        def __iter__(self): return self
        def __next__(self):
            return self.func(next(self.it))

    class map_safe:
        def __init__(self, func, it):
            self.func = func; self.it = iter(it)
        def __iter__(self): return self
        def __next__(self):
            value = next(self.it)
            try:
                return self.func(value)
            except StopIteration:
                raise ValueError("StopIteration raised by map function")

    def map_alsosafe(func, it):
        for value in it:
            yield func(value)

The distinction between naive and safe is *inside the definition of __next__*, and nowhere else. The fault isn't in the function that you pass to map, any more than having it raise AttributeError would be a fault. The reason generators are special is that, despite not having __next__ visible anywhere, they still have that same consideration. That's why they automatically transform StopIterations.
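For comparison, the three variants above can be run against a function that leaks StopIteration. This is only a sketch: the classes are restated so the block runs on its own, and first_item, rows and outcome are hypothetical names added for the demonstration. The RuntimeError from the generator version assumes Python 3.7 or later, where the PEP 479 behaviour is always on.

```python
class map_naive:
    def __init__(self, func, it):
        self.func = func
        self.it = iter(it)
    def __iter__(self):
        return self
    def __next__(self):
        # A StopIteration from func is indistinguishable from exhaustion.
        return self.func(next(self.it))

class map_safe:
    def __init__(self, func, it):
        self.func = func
        self.it = iter(it)
    def __iter__(self):
        return self
    def __next__(self):
        value = next(self.it)  # StopIteration here legitimately ends iteration
        try:
            return self.func(value)
        except StopIteration:
            raise ValueError("StopIteration raised by map function")

def map_alsosafe(func, it):
    for value in it:
        yield func(value)

def first_item(seq):
    # Hypothetical function that leaks StopIteration on empty input.
    return next(iter(seq))

rows = [[1, 2], [], [3, 4]]

def outcome(mapper):
    # Return the collected list, or the name of the exception that escaped.
    try:
        return list(mapper(first_item, rows))
    except Exception as exc:
        return type(exc).__name__

results = {m.__name__: outcome(m) for m in (map_naive, map_safe, map_alsosafe)}
print(results)
# {'map_naive': [1], 'map_safe': 'ValueError', 'map_alsosafe': 'RuntimeError'}
```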
In every other situation, StopIteration is part of the API of what you're working with. It is a bug to call next() without checking for StopIteration (or knowingly and intentionally permitting it to bubble).
Exactly: simple usage of next is often a bug. We need to be careful about this every time someone suggests that it's straightforward to do next(iter(obj)).
Yes, but "give me the first entry" is underspecified anyway. What SHOULD happen if there is no first entry? Is ValueError particularly different? If you do the naive thing and leak StopIteration, most likely it'll end up on the console.
The culprit for the problem of leaking StopIteration is next itself, which in the 1-arg form is only really suitable for use when implementing an iterator and not for the much more common case of simply wanting to extract something from an iterable. Numerous threads here, on Stack Overflow, and elsewhere suggesting that you can simply use next(iter(obj)) are encouraging bug-magnet code. Worse, the bug, when it arises, will easily manifest as something like silent data loss and can be hard to debug.
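For contrast with the 1-arg form, here is a minimal sketch of the 2-arg form of next, which returns a default instead of raising StopIteration. The sentinel _MISSING and the function describe_first are hypothetical names used only for this illustration.

```python
_MISSING = object()  # hypothetical sentinel, distinct from any real item

def describe_first(iterable):
    # With a default supplied, next() never raises StopIteration here.
    item = next(iter(iterable), _MISSING)
    if item is _MISSING:
        return "empty"
    return f"first={item!r}"

print(describe_first([]))        # empty
print(describe_first([None]))    # first=None
print(describe_first("abc"))     # first='a'
```

Using a private sentinel rather than None keeps an empty iterable distinguishable from one whose first item happens to be None.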
That's no worse than getattr() and AttributeError. If you call getattr and you aren't checking for AttributeError, then you could be running into the exact same sorts of problems, because AttributeError is part of the function's API.
The difference is that you usually don't try to catch AttributeError in a higher-up frame. A function that leaks StopIteration is not iterator-safe and cannot be used with functional iterator tools like map. The exact reason for the danger of bare next is not obvious even to experienced Python programmers. Before the discussions around the PEP I had pointed it out several times and saw experienced commenters on lists like this being confused about what exactly the problem was. Maybe I'm not good at explaining myself, but if the problem were obvious then it shouldn't have needed careful explanation.
Nor do you usually catch StopIteration. There are very very few cases where a StopIteration will silently truncate something, and they are all cases where the function should probably be changed. In user code, it's the rule of thumb that I described: be aware of StopIteration when writing __next__ or calling next(), otherwise it shouldn't be a problem. The problem is most definitely NOT obvious, because most situations are simply *not a problem*, and most of the ones that ARE a problem would still be just as much of a problem with any other exception.
The real advantage of providing first (or "take" or any of the other names that have been proposed in the past) is that it should raise a different exception like ValueError so that it would be safe to use by default.
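A sketch of what such a function might look like, purely to show the mechanics of the proposal. This first is hypothetical (it is not a builtin, and the name, signature and error message are assumptions), and the sample data is made up.

```python
def first(iterable):
    """Hypothetical helper: return the first item or raise ValueError."""
    for item in iterable:
        return item
    raise ValueError("first() called on an empty iterable")

rows = [[1, 2], [], [3, 4]]

# map does not treat ValueError as an end-of-iteration signal, so the
# exception propagates to the caller instead of ending the iteration.
try:
    result = list(map(first, rows))
except ValueError as exc:
    result = f"ValueError: {exc}"

print(result)  # ValueError: first() called on an empty iterable
```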
ValueError is no safer. The first() function would have, as its API, "returns the first element or raises ValueError if there is none". So now the caller of first() has to use try/except to handle the case where there is no value. Failing to do so is *just as buggy* as leaking a StopIteration.
A leaky StopIteration is a majorly confusing bug inside a __next__ function, because StopIteration is part of that function's API.
On the contrary: a __next__ function is the only place where it could possibly be valid to raise StopIteration. The fact that next raises StopIteration which passes through to the caller can be useful in this situation and this situation alone: https://github.com/python/cpython/blob/b37dc9b3bc9575adc039c6093c643b7ae5e91...
In any other situation it would be better to call first() and have something like ValueError instead.
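The pattern of a __next__ that deliberately lets StopIteration from next() pass through can be sketched as follows. The class Doubled is a hypothetical example and is not taken from the linked CPython source.

```python
class Doubled:
    """Hypothetical iterator wrapper that doubles each item."""

    def __init__(self, iterable):
        self._it = iter(iterable)

    def __iter__(self):
        return self

    def __next__(self):
        # No try/except on purpose: when the underlying iterator is
        # exhausted, next() raises StopIteration, which is exactly the
        # signal __next__ is required to give.
        return 2 * next(self._it)

print(list(Doubled([1, 2, 3])))  # [2, 4, 6]
```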
Yes, but that's an example of __next__ specifically chaining to next() - exactly like defining __getattr__ to look for an attribute of something else (maybe you're writing a proxy of some sort). You expect that a bubbling-up exception is fundamentally equivalent to one you raise yourself.
Please give a real example of where calling first() and getting ValueError is safer than calling next(iter(x)) and getting StopIteration. So far, I am undeterred in believing that the two exceptions have equivalent effect if the caller isn't expecting them.
ChrisA