Map and filter should also convert StopIteration to RuntimeError

" exception StopIteration Raised by built-in function next() and an iterator‘s __next__() method to signal that there are no further items produced by the iterator." I have always taken this to mean that these are the only functions that should raise StopIteration, but other have not, and indeed, StopIteration in generator functions, which are not __next__ methods, has been accepted and passed on by generator.__next__. PEP 479 reverses this acceptance by having generator.__next turn StopIteration raised in a user-written generator function body into a RuntimeError. I propose that other builtin iterator.__next__ methods that execute a passed in function do the same. This proposal comes from Oscar Benjamin's comments on the PEP in the 'Generator are iterators' thread. In one post, he gave this example.
Function func violates what I think should be the guideline. When passed to map, the result is a silently buggy iterator. The doc says that map will "Return an iterator that applies function to every item of iterable, yielding the results." If map cannot do that, because func raises for some item in iterable, map should also raise, but the exception should be something other than StopIteration, which signals normal non-buggy completion of its task. I propose that map.__next__ convert StopIteration raised by func to RuntimeError, just as generator.__next__ now does for StopIteration raised by executing a generator function frame. (And same for filter().)

    class newmap:
        "Simplified version allowing just one input iterable."
        def __init__(self, func, iterable):
            self.func = func
            self.argit = iter(iterable)
        def __iter__(self):
            return self
        def __next__(self):
            func = self.func
            args = next(self.argit)  # pass on expected StopIteration
            if func is None:
                return args
            else:
                try:  # new wrapper
                    return func(args)
                except StopIteration:
                    raise RuntimeError('func raised StopIteration')

-- Terry Jan Reedy
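
To make the failure mode concrete, here is a minimal sketch of the kind of situation being described; the callback and data are invented for illustration:

    def square_positive(x):
        # illustrative callback: raising StopIteration here is the 'bug'
        if x < 0:
            raise StopIteration
        return x * x

    # list() treats the leaked StopIteration as normal exhaustion of the
    # map iterator, so the output is silently truncated instead of raising:
    print(list(map(square_positive, [1, 2, 3, -1, 4])))   # prints [1, 4, 9]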

On Sat, Dec 13, 2014 at 8:14 AM, Terry Reedy <tjreedy@udel.edu> wrote:
(I'm not sure why you have the "if func is None" check. Currently map() doesn't accept None as its function. But that could equally be implemented below if you wish.) I propose a much MUCH simpler version of map, then:

    def newmap(func, iterable):
        """Simplified version allowing just one input iterable."""
        for val in iterable:
            yield func(val)

Et voila! Conversion of StopIteration into RuntimeError. ChrisA
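
A quick usage sketch of what Chris is pointing at, assuming the newmap generator above and PEP 479 semantics (the default from Python 3.7, or with `from __future__ import generator_stop` on 3.5/3.6); the callback is invented:

    def bad(x):
        raise StopIteration   # a callback that leaks StopIteration

    it = newmap(bad, [1, 2, 3])
    next(it)
    # Under PEP 479 the leak is converted at the generator boundary:
    # RuntimeError: generator raised StopIteration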

On 12/12/2014 4:21 PM, Chris Angelico wrote:
Because I initially tried to implement newmap with multiple iterators, I looked at the 2.7 itertools.imap equivalent Python code (removed from the doc in 3.x) and I forgot that None no longer works. The check and return should be removed.
Currently map() doesn't accept None as its function.
map.__init__ accepts None, but map.__next__ croaks trying to call it.
I wrote a class because map is a class with .__next__. A Python implementation would better be a generator function, as with the examples in the itertools doc. -- Terry Jan Reedy
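
For reference, the behaviour Terry describes can be seen interactively (output abridged):

    >>> m = map(None, [1, 2, 3])    # construction succeeds
    >>> next(m)                     # calling the 'function' fails
    Traceback (most recent call last):
      ...
    TypeError: 'NoneType' object is not callable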

On Fri, Dec 12, 2014 at 4:14 PM, Terry Reedy <tjreedy@udel.edu> wrote:
I propose that map.__next__ convert StopIteration raised by func to RuntimeError
Aren't we stepping on a slippery slope here? What about, say, itertools.accumulate() called with a function that raises StopIteration to cut the accumulation short?

On 12 December 2014 at 21:24, Alexander Belopolsky <alexander.belopolsky@gmail.com> wrote:
It applies to map, filter, and 5 or 6 things in itertools and certainly many other things outside stdlib. I guess the question is: should it be considered a bug for an iterator to leak a StopIteration outside of its "normal" exit condition? Apparently it was for generators which represent the majority of all iterators and it was considered sufficiently important to change the core language in a backward incompatible way. Oscar

On Fri, Dec 12, 2014 at 1:44 PM, Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
I don't think anyone said it was a bug -- often it is a bug, but that's use-case dependent.
which represent the majority of all iterators and it was considered sufficiently important to change the core language in a backward incompatible way.
It is considered prone to hard-to-find and hard-to-understand bugs -- which was the motivation for the PEP. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@noaa.gov

On 12 December 2014 at 21:59, Chris Barker <chris.barker@noaa.gov> wrote:
I don't think that the positive use-cases of this feature are common.
The same reasoning applies here. I had a bug like this and it was data-dependent, hard to reproduce, and there was no exception to hook a debugger onto or stack trace to give information. Since the full program was quite slow and I had to keep running it in full, it turned what would have been a one-minute fix into half a day of debugging. Since then I've simply been wary about next() and haven't suffered the problem again. I suspect that in the same situation now it wouldn't take me so long, because at the time I wasn't really aware of the possibility that StopIteration could get "caught" by a for-loop several frames up.
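
A minimal sketch of this kind of bug, with no generators involved (the names and data are invented); the bad next() call is two frames below the loop that silently swallows its StopIteration:

    def first_tag(record):
        # next() on a possibly empty iterator: the StopIteration raised
        # here is a data-dependent accident, not an end-of-data signal
        return next(iter(record["tags"]))

    data = [{"tags": ["a"]}, {"tags": []}, {"tags": ["c"]}]

    # map() lets the StopIteration escape from first_tag(), and this
    # for-loop treats it as exhaustion of the map iterator:
    for tag in map(first_tag, data):
        print(tag)    # prints only 'a'; no traceback anywhere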

On 12/12/2014 02:11 PM, Oscar Benjamin wrote:
And now it will raise an exception at the point where the error actually occurred. Having gone through that experience I would have thought you would be more in favor of the change. -- ~Ethan~

On 12 December 2014 at 22:33, Ethan Furman <ethan@stoneleaf.us> wrote:
It won't, because I wasn't using generators. The point I have been trying to make is that this isn't just about generators.
Having gone through that experience I would have thought you would be more in favor of the change.
On further reflection I am more in favour of the change. My initial thought was that it places the emphasis in the wrong place. I had concluded that the focus should be on the inappropriateness of next(), and I still think that, but that doesn't mean that the PEP isn't a good thing in and of itself.

On 12/12/2014 02:41 PM, Oscar Benjamin wrote:
FWIW I agree that the real culprit is next() -- just about any other function that we call will raise an exception to signal that something went wrong, but in next()'s case, asking for the next item when there isn't one raises a flow-control exception instead of an EmptyIterable exception. Happily, we can write our own next() for our own modules (or even for built-ins if we're really adventurous!):

    # untested
    import builtins

    class EmptyIterable(Exception):
        pass   # not defined in the original sketch; added so this runs

    unsafe_next = builtins.next
    _unset_ = object()

    def next(iterator, default=_unset_):
        try:
            return unsafe_next(iterator)
        except StopIteration:
            if default is _unset_:
                raise EmptyIterable
            return default

-- ~Ethan~
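
A short usage sketch of the replacement next() above (purely illustrative):

    it = iter([1])
    next(it)            # -> 1, normal behaviour is unchanged
    next(it, 'done')    # -> 'done', the default argument still works
    next(it)            # raises EmptyIterable instead of StopIteration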

On 12/12/2014 4:24 PM, Alexander Belopolsky wrote:
I consider this a wretched 'function'. If an input value is outside the function's domain, making 'return value' impossible, the function should raise TypeError or ValueError. As it is, it mixes a function with a hard-coded takewhile. (A similar comment applies to the example I copied from Oscar.) The two 'functions' should be kept separate.
To me, this is a bug. accumulate should accumulate until the iterable is exhausted. (Or the doc should change to say that it accumulates until the iterable is exhausted or func raises StopIteration.) If accumulate cannot continue, I think it should raise something other than StopIteration. The doc says that accumulate is equivalent to

    def accumulate(iterable, func=operator.add):
        it = iter(iterable)
        total = next(it)
        yield total
        for element in it:
            total = func(total, element)
            yield total

In 3.5, the StopIteration raised by f above will become RuntimeError. If accumulate.__next__ is not changed to match, then the equivalent would have to be given as

    def accumulate(iterable, func=operator.add):
        it = iter(iterable)
        total = next(it)
        yield total
        for element in it:
            try:
                total = func(total, element)
            except StopIteration:
                return
            yield total

Yes, similar considerations apply to all the itertools classes that call user functions: dropwhile, filterfalse, groupby, starmap, and takewhile. "itertools.dropwhile(predicate, iterable) -- Make an iterator that drops elements from the iterable as long as the predicate is true; afterwards, returns every element." If predicate raises before becoming false, dropwhile cannot 'return every element after'. Should dropwhile simply pass through StopIteration, falsely saying that there are no more elements, or should it convert StopIteration to an exception that says 'something is wrong'? (It could also ignore exceptions from predicate, but I do not suggest that.) -- Terry Jan Reedy
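
For concreteness, here is a small sketch of the silent truncation being discussed, modelled on the kind of function in Alexander's example (the function and data are invented):

    from itertools import accumulate

    def f(total, element):
        # abuse StopIteration as flow control: stop once the running
        # total would exceed 10
        if total + element > 10:
            raise StopIteration
        return total + element

    # accumulate's __next__ lets the StopIteration escape, so list()
    # reads it as normal exhaustion and the result is silently cut short:
    print(list(accumulate([1, 2, 3, 4, 5, 6], f)))   # prints [1, 3, 6, 10]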

On 12/12/2014 7:36 PM, Greg Ewing wrote:
Terry Reedy wrote:
"I have always taken this to mean that [next and __next__] are the only functions that should raise StopIteration," ...
That such functions should catch StopIteration (most likely from next) is implied in the paragraph's opening sentence, repeated above. What the replacement should be is a different matter. There are often better choices than RuntimeError. -- Terry Jan Reedy

That does seem to be where this is headed :-( I'm -1 on this proposal.

Generators are part of the language internals and language spec, so Guido can reasonably decide to take this in whatever direction he wants. In contrast, now you're moving on to parts of the language library that merely call functions and return results. It is not the responsibility of accumulate(), filter(), map(), or any other higher-order functions to impose rules about what those functions are allowed to do -- they can take as much time as they want, they can hold the GIL, they can manipulate signals, they can use tons of memory, they can return scalar values or complex objects, and they can raise any exception, but now StopIteration would become an event that gets special treatment.

Further, this new made-up rule (for which there is zero demonstrated need in any language I know) would have to be applied throughout the standard library and possibly be extended to third-party code. It would be yet another language idiosyncrasy that would have to be learned, remembered, and StackOverflowed about for an eternity.

In the case of generators, the PEP 479 rule is easily applied (not hard to search for or to mitigate). In contrast, the "fix" in this case would need to be applied to the *called* functions or their callees, possibly far removed from the map/filter/accumulate call. If that function is in third-party code or in a C library, then the mitigation would require redesigning the call logic completely (not fun) or wrapping the function in something that transforms a StopIteration into a custom exception and then re-catching the custom exception upstream from the higher-order function (also not fun).

For anyone who has tested code that is currently working correctly but becomes broken by this proposal, the simplest mitigation will be for them to write their own variants of map/filter/accumulate/dropwhile/takewhile/groupby/starmap that just ignore this proposal and restore the behavior that has been happily in place for ages.

Raymond

On 12/12/2014 10:10 PM, Raymond Hettinger wrote:
Completely agreed. While the problem (at least in some views, my own included) is really about how next() treats an empty iterable, the biggest reason against changing how next() works is all the correctly written code that would be broken (by "correct" I mean code that already guards against next() raising a StopIteration). Conversely, the biggest reason for changing just generators (again, at least in my view ;) is that when writing a generator it's easy to forget that you are actually mucking about with internals and should be guarding against an unexpected flow-control exception occurring, as it were, out of the blue. -- ~Ethan~

On 13 December 2014 at 06:10, Raymond Hettinger <raymond.hettinger@gmail.com> wrote:
In contrast, the "fix" in this case would need to be applied to the *called* functions or their callees, possibly far removed from the map/filter/accumulate call. If that function is in third-party code or in a C-library, then the mitigation would require redesigning the call logic completely (not fun) or to wrap the function in something transforms a StopIteration into a custom exception and then re-catches the custom exception upstream from the higher-order function (also not fun).
Is it a common pattern to raise StopIteration from some deeply nested function (that isn't itself a __next__ method) in order to terminate iteration at some far removed level? If it is, I think PEP 479 largely breaks that pattern, since it will no longer work if any generators are involved. Oscar
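
A sketch of the pattern Oscar is asking about, and of how PEP 479 interacts with it (names invented):

    def stop():
        # a deeply nested helper that terminates iteration far above
        raise StopIteration

    def producer(data):
        for item in data:
            if item is None:
                stop()        # relies on the leak reaching the generator
            yield item

    # Pre-PEP-479:  list(producer([1, 2, None, 3])) == [1, 2]
    # Post-PEP-479 (Python 3.7+): the same call raises
    #   RuntimeError: generator raised StopIteration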

On Fri, Dec 12, 2014 at 10:10 PM, Raymond Hettinger < raymond.hettinger@gmail.com> wrote:
Me too. PEP 479 was a clear win. __next__ methods are a much murkier area and we should not mess with them.
When it comes to map() and filter() and friends the situation is murkier yet. In Python 2, raising StopIteration in the function passed to map/filter was just bubbled out. In Python 3, where these have become lazy, a StopIteration raised by the function terminates the iteration. Was that intentional? I doubt it. Is it now a feature? Maybe (though it would make more sense to use something like takewhile). Could it mask a subtle bug? Probably (but it doesn't sound like a common situation -- map/filter functions are usually small and simple). Should we "fix" it? I don't think so. It's not sufficiently broken, the fix would lead us onto a slippery slope. Enough is enough.
Raymond
-- --Guido van Rossum (python.org/~guido)
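
A sketch of the difference Guido describes between the eager Python 2 map() and the lazy Python 3 map() (the callback is invented):

    def pick(x):
        if x == 3:
            raise StopIteration   # leaks out of the callback
        return x

    # Python 2: map() is eager, so the exception simply bubbles out of
    #   map(pick, [1, 2, 3, 4]) as an ordinary traceback.
    # Python 3: map() is lazy, so the consumer reads the same exception
    #   as 'iterator exhausted' and the output is silently truncated:
    print(list(map(pick, [1, 2, 3, 4])))   # prints [1, 2]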

On 12/13/2014 3:48 PM, Guido van Rossum wrote:
After reading other responses, even before reading this one, I decided that Alexander Belopolsky's initial comment "Aren't we stepping on a slippery slope here?" was correct, that generator.__next__ and generator functions are a sufficiently unique case to get unique treatment, and that I would withdraw my proposal. [The rest of Guido's post, which I have snipped, further explicates the slipperiness.] I am also cognizant that creating incompatibilities between even 3.4 and 3.5 requires strong justification. So I consider this trial balloon popped.

My proposal was a response to the problem of *silent breakage* of 'transform iterators' that take as arguments an iterator and a transform function and produce a new, transformed sequence. While breaking functions with bad input generally falls under 'consenting adults', a transform function raising StopIteration is special in that it can easily lead to silent rather than noisy breakage.

I have two alternate proposals.

1. In the doc entry for StopIteration, explicitly recommend against raising StopIteration outside of __next__ methods, on the basis that doing so can lead to silent breakage of iterators. Then writing functions that do so anyway, or passing them into iterators, will clearly be a 'consenting adults' issue.

2. Make it easier to avoid accidentally leaking StopIteration when using next() by adding an option to have it raise a different exception. I already posted this idea to a different thread.

-- Terry Jan Reedy

On Fri, Dec 12, 2014 at 04:14:07PM -0500, Terry Reedy wrote:
The term used by the docs is that it is "broken", which in this context has a technical meaning. An iterator is broken if it fails the rule that once it raises StopIteration, it will continue to always raise StopIteration. Whether or not map() and filter() return a "broken iterator" is completely at the mercy of the user's function. func() can silently break the iterator in many different ways:

- it may call os._exit() or os.abort()
- it may call threaded code which deadlocks
- it may call time.sleep(2**10000)
- it may enter an infinite loop

to mention just a few. The documentation's promise that map() will return an iterator that "applies function to every item of iterable" is not an unconditional promise. It cannot possibly be, and neither map() nor Python can detect every possible failure condition in advance. It is not Python's responsibility to police that all iterators are non-broken. That is the responsibility of the coder. If you write func() such that it "breaks" map(), then you either have a good reason for doing so, or you are responsible for your own actions. We are all consenting adults here. Let's not complicate things in a futile attempt to prevent people from shooting themselves in the foot.

There's a thread on python-dev at the moment decrying the difficulty of writing polylingual Python 2 + 3 code and how some people find it sucks all the joy out of writing Python code. Every backwards incompatible change we add just makes it more difficult to deal with the 2/3 transition. New features are an incentive to upgrade to 3.x. This is not an incentive to upgrade, but it will make it harder to write and understand polylingual iterator code. Gratuitously fixing perceived weaknesses in the iterator protocol which have been there since it was first introduced will, in my opinion, cause more pain than benefit.

In another thread, Nick wrote to Oscar:

    The problem you think PEP 479 is trying to solve *is* the one where the discussion started, but it is *not* the one where it ended when Guido accepted PEP 479. The problems PEP 479 solves are generator specific - they have nothing to do with the iterator protocol in general.

Let's leave the iterator protocol alone. -- Steven

On Sat, Dec 13, 2014 at 6:45 AM, Steven D'Aprano <steve@pearwood.info> wrote:
Terry was suggesting fixing builtin iterators, not fixing all iterators or the iterator protocol. It *is* Python's responsibility to make sure the builtin iterators are not broken. It's fine and to be expected if user code raises StopIteration, but map and filter don't handle it correctly and therefore disobey the iterator protocol, which is a bug in map and filter -- not the callback. -- Devin

On Sun, Dec 14, 2014 at 6:02 AM, Devin Jeanpierre <jeanpierreda@gmail.com> wrote:
I'm not sure that "fine and to be expected" is necessarily true. If the function given to map() raises StopIteration, I'd say any of these approaches is justifiable: 1) It's an error. Raise something other than StopIteration, so the caller knows something bad happened. (A generator version of map() will do this post-479.) 2) It's to be expected. Consider map() to have now terminated. (In other words, catch it and return.) 3) It's a signal that this one element should not be returned. Suppress it, yield nothing, and go get the next value from the input iterator. Personally, I'd rank them in this order of preference. Consistency with generators is a plus (especially in terms of citing a pure-Python equivalent), and the second and especially third options risk opening up a misfeature - using StopIteration as a weird form of flow control/signalling. But your description implies that option 2 is inherently obvious, which I'm not entirely convinced of; unless this is actually implemented as a generic "this iterator is now exhausted" flag, and next() itself sets and checks this flag, thus automatically enforcing that one StopIteration will result in an infinite stream of StopIterations. ChrisA

On Sat, Dec 13, 2014 at 11:51 AM, Chris Angelico <rosuav@gmail.com> wrote:
I worded myself poorly. It's not fine, it probably signals a bug in the callback (which is what makes #1 a reasonable option). What I mean is that callbacks can return any return value and raise any exception. Doing so shouldn't cause you to break the documented API of your type, including breaking its conformance to the iterator protocol. Anyway, I think we're on the same page, since I agree that your suggestions are justifiable, and you seem to agree that the status quo is not. This is not a "consenting adults"-type issue. map() and filter() don't correctly handle a perfectly plausible case, and as a result they violate their documented API. That is a bug. I'd rather fix the code than fix the docs. I can write a patch implementing Chris's #1, if it would be accepted. I don't think it needs a PEP. -- Devin

On Sun, Dec 14, 2014 at 11:47 PM, Devin Jeanpierre <jeanpierreda@gmail.com> wrote:
Sure. Then yes, I think we are broadly in agreement.
Sounds fine to me, although without concrete use-cases I can't say it would be a huge practical improvement. It's still a nice theoretical improvement, tightening up a promise. ChrisA

On Sat, Dec 13, 2014, at 14:51, Chris Angelico wrote:
It's worth mentioning that #2 is, more or less,* precisely how both map and a pure-python equivalent behaved before PEP 479. Some people in this discussion considered this expected. Obviously that was not the end of the discussion. *Assuming that the code consuming the outer iterator will not attempt to read it again after a single StopIteration has been raised. A well-behaved iterator will continue to raise StopIteration forever after it has raised it once - an iterator whose StopIteration has been caused by some inner function will not.
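
The caveat in the footnote can be demonstrated directly with the current map() (the callback is invented):

    def f(x):
        if x == 2:
            raise StopIteration   # leaks out of the callback
        return x

    m = map(f, [1, 2, 3])
    next(m)    # -> 1
    next(m)    # raises StopIteration (leaked from f)
    next(m)    # -> 3: the iterator 'resumes' after claiming exhaustion,
               # which is exactly what makes it a broken iterator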

On Tue, Dec 16, 2014 at 4:37 AM, <random832@fastmail.us> wrote:
Right, and it's that assumption that's the entire question here. This is buggy behaviour; the question is, is the bug map's problem (because it's failing to behave the way an iterator should), or is it simply a bug in the called function, which shouldn't be raising StopIteration? ChrisA

On 13 December 2014 at 07:14, Terry Reedy <tjreedy@udel.edu> wrote:
No, Oscar's proposal is based on a fundamental misunderstanding of the problem PEP 479 is solving. The PEP solves a generator specific problem with having "Two Ways to Do It" when it comes to terminating generator functions. It does this by way of the relatively straightforward expedient of having the PEP 380 approach completely replace the legacy approach, rather than retaining the status quo of offering both in parallel (my attempt at explaining this perspective in more detail: https://mail.python.org/pipermail/python-ideas/2014-December/030371.html). __next__ methods in general don't have this problem - there's only one way to signal termination from a __next__ method implementation, and that's raising StopIteration. Regards, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

participants (13)
- Alexander Belopolsky
- Chris Angelico
- Chris Barker
- Devin Jeanpierre
- Ethan Furman
- Greg Ewing
- Guido van Rossum
- Nick Coghlan
- Oscar Benjamin
- random832@fastmail.us
- Raymond Hettinger
- Steven D'Aprano
- Terry Reedy