Change StopIteration handling inside generator-like builtins

PEP 479 (https://www.python.org/dev/peps/pep-0479/) changed the rules around generators: if it would have leaked StopIteration, it instead raises RuntimeError. This converts hard-to-debug premature termination into easily-spotted exceptions, but it applies only to actual generator functions.

Implementing a map-like function as a generator means you're safe:

    def poof(x):
        if x == 5: raise StopIteration # Leak!
        return x * 3

    def mymap(func, it):
        for value in it:
            yield func(value)

    for n in mymap(poof, range(10)):
        print(n)

Boom! RuntimeError that highlights the source of the StopIteration. Similarly, using a list comprehension is safe:

    [poof(n) for n in range(10)] # raises StopIteration

But using the built-in map() will silently terminate:

    for n in map(poof, range(10)):
        print(n)

I propose to grant PEP 479 semantics - namely, that a StopIteration during the calling of the mapped function be translated into a RuntimeError. Likewise for filter(), guarding the predicate function, and all similar functions in itertools: accumulate, filterfalse, takewhile/dropwhile, starmap, and any that I didn't notice.

ChrisA
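As a rough illustration of what the proposed semantics might look like, here is a pure-Python sketch. The real map() is implemented in C and accepts several iterables; the class name guarded_map is hypothetical and handles only a single iterable. It lets exhaustion of the input end the iteration normally, but translates a StopIteration raised by the mapped function into RuntimeError:

```python
class guarded_map:
    """Hypothetical map-like iterator with PEP 479-style guarding."""

    def __init__(self, func, iterable):
        self._func = func
        self._it = iter(iterable)

    def __iter__(self):
        return self

    def __next__(self):
        # Exhaustion of the underlying iterator is legitimate: let it propagate.
        value = next(self._it)
        try:
            return self._func(value)
        except StopIteration as exc:
            # A StopIteration from the mapped function is a leak, not exhaustion.
            raise RuntimeError("mapped function raised StopIteration") from exc


def poof(x):
    if x == 5:
        raise StopIteration  # Leak!
    return x * 3


# The built-in map() stops quietly at the leak.
silently_truncated = list(map(poof, range(10)))

# The guarded version reports the leak instead.
try:
    list(guarded_map(poof, range(10)))
except RuntimeError as exc:
    guarded_outcome = type(exc).__name__
```

With this sketch, silently_truncated holds only the values produced before the leak, while the guarded call surfaces a RuntimeError chained to the original StopIteration.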

On Sun, 8 Dec 2019 at 14:37, Chris Angelico <rosuav@gmail.com> wrote:
[snip]
The problem here is that in the case of generators there is a single place that you can fix this in the implementation of generators in the interpreter as part of the definition of the language. With iterators there are many iterator tools including in third party code outside of the stdlib. They won't all be fixed so it would remain the case that bare next should be discouraged in most uses and that there isn't a drop-in replacement for someone who just wants to use next without a default value.

I think you will also find that much code is depending on map etc to behave in the current way. When learning the iterator protocol it seemed to me that being able to raise StopIteration from anywhere was a design feature.

--
Oscar
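To make the concern about iterator tools outside the interpreter concrete, here is a minimal sketch of a hand-written iterator of the kind that might live in third-party code. The class name Pairs is hypothetical. Its __next__ uses bare next twice, and no change to the builtins would alter its behaviour:

```python
class Pairs:
    """Hypothetical third-party iterator yielding consecutive pairs."""

    def __init__(self, iterable):
        self._it = iter(iterable)

    def __iter__(self):
        return self

    def __next__(self):
        first = next(self._it)   # exhaustion here is a clean end of iteration
        second = next(self._it)  # exhaustion here silently drops `first`
        return first, second


# With an odd number of items, the trailing 5 disappears without any error.
pairs = list(Pairs([1, 2, 3, 4, 5]))
```

Here pairs ends up as [(1, 2), (3, 4)]: the second bare next leaks StopIteration, which the consumer interprets as normal exhaustion.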

On Mon, Dec 9, 2019 at 1:57 AM Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
Yes, not everything can be fixed. I don't think that means we shouldn't fix the things that ARE under our control - that is, the standard library.

If someone's using next without a default value, there are three likely possibilities:

1) The intention was specifically to reraise the StopIteration. Applicable only within a __next__ function, and still valid.
2) The intention was to have some other behaviour, and it's wrapped in try/except right at the call site.
3) The programmer never even thought about it, and is assuming the iterator is not empty/exhausted.

Most consumers of iterables should be handling StopIteration right there (either by using a for loop, or with an actual try/except). The case of a first() function that led to this proposal is a perfect example: it'd be correct for it to raise ValueError on an empty iterable, so it should try/except and raise a different exception.

The trouble is that the third case is an extremely subtle one. It's all very well to state in the docs that "bare next [is] discouraged in most uses", but people will do it, and the correct thing to do is to report it as an exception. And that's exactly what happens in most situations. In the body of a for loop, or in a list comp, the StopIteration comes right on out as an exception. In a generator, it becomes RuntimeError. The ONLY problem is when a __next__ function has code in it that could raise/leak StopIteration, and doesn't catch that.
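The first() case described above could be sketched roughly as follows. This is one possible spelling, not the function from the earlier thread; the error message is an assumption:

```python
def first(iterable):
    """Return the first item, or raise ValueError if there is none."""
    try:
        return next(iter(iterable))
    except StopIteration:
        # Translate exhaustion into an error that cannot be mistaken
        # for the end of some enclosing iteration.
        raise ValueError("first() called on an empty iterable") from None


# Used inside map(), an empty input now fails loudly instead of
# silently truncating the result.
try:
    list(map(first, [[1], [], [2]]))
except ValueError as exc:
    map_outcome = type(exc).__name__
```

Because the StopIteration is caught at the call site, nothing can leak into map() or any other iterator that happens to be calling first().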
From inside a map function? I'd like to see some examples here - a function that's designed to be called from inside map, which deliberately raises/leaks StopIteration, intending to halt the map cleanly. (Or equivalent with filter etc.)

Be aware that code like this would break if spelled any way other than map():

    def poof(x):
        if x == 5: raise StopIteration
        return x * 3

    short_list = list(map(poof, range(10)))
    boom = [poof(x) for x in range(10)]
    boom = list(poof(x) for x in range(10))
    for x in range(10):
        print(poof(x)) # boom

I'd call code like this "fragile".

ChrisA
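The difference between these spellings can be checked with a small harness. This is only a sketch, assuming Python 3.7 or later (where PEP 479 is always in effect); the helper names outcome and plain_loop are hypothetical:

```python
def poof(x):
    if x == 5:
        raise StopIteration
    return x * 3


def outcome(thunk):
    """Run thunk() and report its result, or the name of the exception raised."""
    try:
        return thunk()
    except Exception as exc:
        return type(exc).__name__


def plain_loop():
    results = []
    for x in range(10):
        results.append(poof(x))
    return results


results = {
    "map": outcome(lambda: list(map(poof, range(10)))),
    "list comprehension": outcome(lambda: [poof(x) for x in range(10)]),
    "generator expression": outcome(lambda: list(poof(x) for x in range(10))),
    "for loop": outcome(plain_loop),
}
```

Only the map() spelling yields a (shortened) list; the list comprehension and the for loop let StopIteration escape, and the generator expression turns it into RuntimeError.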

participants (3)
- Chris Angelico
- Oscar Benjamin
- Soni L.