Change how Generator Expressions handle StopIteration

Hi, I would like to propose that generator expressions not catch a StopIteration exception if that exception did not come specifically from the iterated object's __next__ method. Generator expressions would then be able to raise StopIteration while computing their current value. Here is an example of a use-case:

def izip(*args):
    iters = [iter(obj) for obj in args]
    while True:
        yield tuple(next(it) for it in iters)

a = izip([1,2],[3,4])
print(next(a), next(a), next(a))
# Currently prints: (1, 3) (2, 4) ()

list(izip([1,2],[3,4]))  # Currently never returns

Even though this is the behaviour described in the PEP, I think it is unwanted. I think generator expressions should work like list comprehensions in that sense:

def iizip(*args):
    iters = [iter(obj) for obj in args]
    while True:
        yield tuple([next(it) for it in iters])

tuple(iizip([1,2],[3,4]))  # Returns ((1, 3), (2, 4))

I think you're on to something. But I think both your examples have a problem, even though your second one "works". If we weren't forced by backward compatibility I would have made it much harder for StopIteration to "leak out". Currently a generator can either return or raise StopIteration to signal it is done, but I think it would have been better if StopIteration were treated as some kind of error in this case. Basically, I think any time a StopIteration isn't caught by a for-loop or an explicit try/except StopIteration, I feel there is a bug in the program, or at least it is hard to debug. I'm afraid that ship has sailed, though...

On Sat, Nov 1, 2014 at 7:56 AM, yotam vaknin <tomirendo@gmail.com> wrote:
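A minimal sketch of that failure mode (illustrative code, using the generator behaviour as it stood at the time):

def leaky():
    yield 1
    next(iter([]))  # a stray StopIteration escapes the generator frame
    yield 2         # never reached

# No traceback anywhere: the loop just ends early.
for x in leaky():
    print(x)  # prints only 1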
-- --Guido van Rossum (python.org/~guido)

On 11/1/2014 12:50 PM, Guido van Rossum wrote:
I think you're on to something. But I think both your examples have a problem, even though your second one "works".
Both versions are buggy in that iizip() yields () infinitely, while zip() yields nothing. Fixes below.
This would require some sort of additional special casing of StopIteration that we do not have now. Currently, it is limited to 'for' loops expecting and catching StopIteration as a signal to stop iterating. That is rather easy to understand.
Outside of generator functions (and expressions), I agree, as I cannot think of an exception to that rule. This has come up on python-list.
or at least it is hard to debug.
Code within generator functions is different. Writing "raise StopIteration" instead of "return" is mostly a waste of keystrokes. As for next(it), StopIteration should usually propagate, as with an explicit raise and not be caught. The code below that 'works' (when it does work), works because the StopIteration from next(it) (when there is at least one) propagates to the list comp, which lets it pass to the generator, which lets it pass to the generator user.
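In isolation (an illustrative two-liner, using the behaviour as it stands):

it = iter([])
tuple(next(it) for _ in range(3))    # () -- tuple() absorbs the StopIteration
tuple([next(it) for _ in range(3)])  # raises StopIteration immediately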
For the purpose of your example, all instances of StopIteration are the same and might as well be the same instance. Since to my understanding generators and g.e.s already do not catch the StopIterations you say you want not caught, and since you need for it to not be caught in the code below, I do not understand exactly what you propose.
I cannot understand this.
Better test code that avoids infinite looping:

a = izip([1,2],[3,4])
for i in range(3):
    print(next(a))

On the third loop, the above prints (), while the below prints a traceback. With a = izip(), both print () 3 times. The problem is that when next(it) raises, you want the StopIteration instance propagated (not immediately caught), so that the generator-using code knows that the generator is exhausted. But the tuple call catches it first, so that, in combination with 'while True', the user never sees StopIteration. A partial solution is to provoke StopIteration before calling tuple, so that it does propagate. That is what the list comp below does. But if args is empty, so is iters, and there is no next(it) to ever raise. For a complete solution that imitates zip and does not require an otherwise useless temporary list, replace the loop with this:

while True:
    t = tuple(next(it) for it in iters)
    if not t:
        return
    yield t
Even though this is the behaviour described in the PEP, I think it is unwanted.
Not if you think carefully about what you want to happen when next(it) raises. I think generators and generator expressions should be left alone.
This could be fixed with 'if not iters: return' as the second line. Replacing [genexp] with list(genexp) does not work because the latter, unlike the former, catches StopIteration. This is proof that the two are not exactly equivalent, and it is the only such behavior difference I know of (excluding introspection, such as with trace). -- Terry Jan Reedy

On 2 November 2014 02:50, Guido van Rossum <guido@python.org> wrote:
The closest existing example of this kind of generator instance specific StopIteration handling that I can think of is the special case handling of StopIteration in contextlib._GeneratorContextManager.__exit__() (https://hg.python.org/cpython/file/3.4/Lib/contextlib.py#l63). There, the exception handling differentiates between "a specific StopIteration instance that we just threw into the subgenerator" (which it will allow to propagate) and "any other StopIteration instance, which indicates that the wrapped generator iterator terminated as expected" (which it will suppress).

We had that wrong initially - if I recall correctly, it was PJE that noticed the problem before 2.5 was released. However, the only reason we were able to make it work is that we knew the exact identity of the exception we were throwing in, rather than just its type - we don't have that luxury in the general case.

Getting back to the behaviour that prompted the thread, like a lot of exception handling quirks, it gets back to being very careful about the scope of exception handlers. In this case, the "next(it)" call is inside a generator expression, and hence inside the scope of the expression's StopIteration handling. By contrast, the comprehension version doesn't *have* any implicit exception handling, so the StopIteration escapes to terminate the containing generator.

In terms of changing the behaviour of generator expressions to allow other StopIteration instances to propagate, I believe I do see one possible way to do it that limits the degree of backwards incompatibility. Firstly, you'd need to add a general purpose capability to generator iterators:

def _set_default_exception(exc):
    """Supply a specific StopIteration instance to raise
       when the generator frame returns None"""
    ...

Normally, when the generator execution falls off the end of the frame by returning, the interpreter raises StopIteration if the result is None, or StopIteration(result) if the result is not None. With the new method, you could set a specific instance to be raised when the underlying result of the frame is None. (Side note: "return" and "raise StopIteration" in a generator function aren't *exactly* the same, as only the former relies on the return->raise conversion supplied by the surrounding generator iterator object.)

That part would be entirely backwards compatible, and would allow you to distinguish whether calling "next", "send" or "throw" on any generator threw StopIteration because the underlying frame returned None (by checking if the StopIteration instance was the one you configured to be raised on a None result), or because it either returned a non-None value or else something running inside that frame threw StopIteration.

The backwards incompatible part would be to then also change generator expressions to set a specific StopIteration instance to be raised when the underlying frame returned, and allow all other StopIteration instances to escape, just as contextlib._GeneratorContextManager.__exit__ allows StopIteration instances thrown from the body of the with statement to escape.

I think the end result of such a change would definitely be less surprising, as it would make generator expressions behave more like the corresponding comprehensions, and eliminate a hidden infinite loop bug. However, I'm not sure if it's *sufficiently* less surprising to be worth changing - especially since it would mean incurring a small amount of additional runtime overhead for each generator expression.

Regards,
Nick.
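To make the contextlib technique concrete, a rough sketch of the instance-identity check (simplified; the real code is in contextlib._GeneratorContextManager.__exit__):

def exit_sketch(gen, exc):
    """Throw exc into gen; suppress it only if the generator finishes
    with a *different* StopIteration instance (normal termination)."""
    try:
        gen.throw(exc)
    except StopIteration as caught:
        return caught is not exc  # the one we threw in must propagate
    return False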
P.S. As additional background on the current difference in behaviour between list comprehensions and generator expressions: it has its roots in the same idiosyncrasy where putting a yield expression inside a comprehension actually *turns it into a generator expression*. Comprehensions are full closures, but they don't contain a yield expression, so you get a normal function, which the interpreter then calls. The interpreter doesn't actually do anything particularly special to make a generator expression instead - it just implicitly inserts a yield expression into the closure, which then automatically makes it a generator function instead of a normal one.

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
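The quirk described in that P.S. can be observed directly (a small demo, valid under the Python 3.4 semantics discussed here):

# The yield turns the comprehension's hidden closure into a generator
# function, so the "list comprehension" evaluates to a generator:
quirk = [(yield x) for x in range(3)]
print(quirk)        # <generator object ...>, not a list
print(list(quirk))  # [0, 1, 2]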

On 2 November 2014 19:53, Nick Coghlan <ncoghlan@gmail.com> wrote:
Bah, I should have fully read Terry's reply before responding. He's right, it's the tuple call that's suppressing the exception, not the generator expression itself. That changes the possible solution, by tweaking it to be an optional extension to the iterator protocol, allowing iterators to make the terminating exception configurable:

def __setiterexc__(exc):
    """Specify the exception instance to raise
       when the iterator is exhausted"""
    ...

Iterator consumers (like tuple) could then check for that method and use it to set a specific StopIteration instance, allowing all others to escape. I believe actually doing this would be adding too much complexity for too little gain, but it *is* possible.

Cheers,
Nick.

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
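A sketch of how a consumer might use the proposed hook (everything here is hypothetical, since the method was never implemented):

def consume(iterable):
    it = iter(iterable)
    marker = StopIteration()
    hook = getattr(it, '__setiterexc__', None)
    if hook is not None:
        hook(marker)  # ask the iterator to raise *this* instance when done
    items = []
    while True:
        try:
            items.append(next(it))
        except StopIteration as exc:
            if hook is not None and exc is not marker:
                raise  # some *other* StopIteration: let it escape
            return items  # genuine exhaustion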

This is related to the fact that, although the docs imply otherwise, [COMP] isn't exactly equivalent to list(COMP), because of cases like:

def ensure_positive(x):
    if x <= 0:
        raise StopIteration
    return True

eggs = list(x for x in spam if ensure_positive(x))

def ensure_positive(x):
    if x <= 0:
        raise StopIteration
    return x

eggs = list(ensure_positive(x) for x in spam)

In both cases, this acts like a "takewhile": eggs ends up as a list of the initial positive values, and the first non-positive value is consumed and discarded. But if you do the same thing with a list comprehension, the comprehension is aborted by the StopIteration, and eggs never gets set (although the same values are consumed from spam, of course).

IIRC, you asked me what the performance costs would be of changing listcomps to match, and for a trivial comp it worked out to be about 40% for the naive solution (build a genexpr, call it, call list) and about 20% with specially-optimized bytecode. So everyone agreed that even if this is a bug, that would be too much of a cost for too small of a fix.

Of course here the issue is almost the opposite. But they're clearly related; how different comprehensions "leak" exceptions differs.

Sent from a random iPhone

On Nov 2, 2014, at 2:15, Nick Coghlan <ncoghlan@gmail.com> wrote:
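For comparison, the explicit, non-abusive spelling of that truncating behaviour (sample data invented):

from itertools import takewhile

spam = iter([3, 1, 4, -1, 5])
eggs = list(takewhile(lambda x: x > 0, spam))
print(eggs)  # [3, 1, 4] -- the -1 is consumed and discarded, as above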

On 11/2/2014 2:50 PM, Andrew Barnert wrote:
This is related to the fact that, although the docs imply otherwise, [COMP] isn't exactly equivalent to list(COMP),
That purported equivalence is a common meme, which I may have helped spread. If it is implied in the doc, it should be changed. Before I answered on this thread yesterday, I looked for such an implication in the Language Reference (though not the Tutorial), and only found this carefully written description in the expressions chapter:

"In this case, the elements of the new container are those that would be produced by considering each of the for or if clauses a block, nesting from left to right, and evaluating the expression to produce an element each time the innermost block is reached. Note that the comprehension is executed in a separate scope,"

I.e., a comprehension is equivalent to the result of for and if statements in a separate Python function that first initializes a collection object (list, set, or dict) and augments the collection in the innermost scope with the element produced (which is a key,value pair for dicts). Such an equivalent function would not catch any exception raised by the innermost element expression. On the other hand, collection initializers, such as list or tuple, specifically catch StopIteration when fed an iterable. Hence the two cannot be equivalent when the element expression raises StopIteration.

-- Terry Jan Reedy
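Spelled out, the equivalent function Terry describes might look like this (a sketch; f, g and seq are invented placeholders):

def f(x): return x * 2   # placeholder element expression
def g(x): return x > 0   # placeholder condition

def listcomp_equivalent(seq):
    result = []
    for x in seq:
        if g(x):
            result.append(f(x))  # an exception raised here simply propagates
    return result

# whereas list(f(x) for x in seq if g(x)) absorbs a StopIteration
# raised by f or g, truncating the result instead.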

On Nov 2, 2014, at 13:00, Terry Reedy <tjreedy@udel.edu> wrote:
We looked through this last year and decided nothing needed to be changed in the docs, so I doubt it's worth repeating that effort. IIRC, the tutorial may have been confusing in the past but wasn't as of 3.3, and the only place that might confuse anyone was the What's New in 3.0, which is generally only changed to add notes about things which were un-/re-changed (like u"" strings). But again, even if I'm remembering wrong, I don't think it matters. All that being said, I don't really love the reference docs here. Neither 6.2.8 nor anything else explicitly says what the semantics are. It's a pretty obvious guess that the syntax is interpreted the same as for comprehensions (as in 6.2.4), and that the values yielded are those that would be used to produce elements. But the docs don't actually say that.

On Sun, Nov 2, 2014 at 1:00 PM, Terry Reedy <tjreedy@udel.edu> wrote:
I may have started it. I was aware of the non-equivalence (only mostly-equivalence) in Python 2 and I wanted to make them identical in Python 3 -- having one construct be exactly equivalent to another reduces the amount of explaining needed. Unfortunately, people had started to depend on the (in my *current* opinion deplorable) behavior of generator expressions in the face of StopIteration thrown by arbitrary parts of the expression or condition, and the equivalence is still imperfect. At least the variable leakage has been fixed.

I know that when we first introduced generators (not generator expressions) I was in favor of interpreting arbitrary things that raise StopIteration in a generator to cause the generator to terminate just as if it had decided to stop (i.e. 'return' or falling off the end), because I thought there were some useful patterns that could be written more compactly this way -- in particular, the pattern where a generator iterates over another iterator by calling next() on it, does some processing on the value thus produced, and then yields the processed value (or not), and where the logical response to a StopIteration from the inner iterator is to exit the generator. For example:

def only_positive(it):
    while True:
        x = next(it)
        if x > 0:
            yield x

This *particular* example is much better written as:

def only_positive(it):
    for x in it:
        if x > 0:
            yield x

but the idea was that there might be variants where being constrained by a single for-loop would make the code less elegant if you had to catch the StopIteration and explicitly exit the generator.

However, I don't think this idea has panned out. I haven't done a survey, but I have a feeling that in most cases where an explicit next() call is used (as opposed to a for-loop) there's a try/except StopIteration around it, and a fair amount of time is wasted debugging situations where a StopIteration unexpectedly escapes and silently interrupts some loop over an unrelated generator (instead of loudly bubbling up to the top and causing a traceback, which would be more debuggable). And the use case of raising StopIteration from a condition used in a generator expression is iffy at best (it makes the condition function hard to use in other contexts, and it calls to attention the difference between generators and comprehensions).

So I will go out on a limb here and say that this was a mistake and if we can think of easing the transitional pain it would be a good thing to fix this eventually.

-- --Guido van Rossum (python.org/~guido)

On 3 November 2014 13:01, Guido van Rossum <guido@python.org> wrote:
I think I'm guilty as well - when I was working on the Python 3 changes, getting the *scoping* behaviour to be identical between comprehensions and generator expressions was one of the key objectives, so I regularly described it as making "[x for x in seq]" equivalent to "list(x for x in seq)". I unfortunately didn't notice the remaining exception handling differences at the time, or we might have been able to do something about it for 3.0 :(
Having had to do the dance to work around the current behaviour in contextlib, I'm inclined to agree - there's a significant semantic difference between the "this iterable just terminated" StopIteration, and the "something threw StopIteration and nobody caught it", and the current model hides it. However, I also see potentially significant backwards compatibility problems when it comes to helper functions that throw StopIteration to terminate the calling generator - there would likely need to be some kind of thread local state and a helper "iterexit()" builtin and "PyIter_Exit()" C API to call instead of raising StopIteration directly. Making such a change would involve a lot of code churn just to phase out a relatively obscure issue that mostly just makes certain bugs harder to diagnose (as was the case with the generator expression based izip implementation in this thread), rather than directly causing bugs in its own right. Regards, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Sun, Nov 2, 2014 at 8:02 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
That's what I was afraid of. Can you point me to an example of code that depends on this that isn't trivial like Andrew Barnert's ensure_positive() example? I think that particular example, and the category it represents, are excessive cleverness that abuse the feature under discussion -- but you sound like you have helpers for context managers that couldn't be easily dismissed like that.
Maybe. But the real-life version of that bug can be *really* hard to find, and that's usually the kind of thing we'd like to fix. FWIW the implementation of my proposal is easy to describe (which the Zen approves of): when a StopIteration leaves a frame, replace it with some other exception (either a new, custom one or maybe just RuntimeError), chaining the original StopIteration. It's the consequences that are hard to survey and describe in this case (as they affect subtle code depending on the current semantics). -- --Guido van Rossum (python.org/~guido)
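A sketch of what that replacement could look like, emulated in pure Python (the decorator and exception type are illustrative; the real change would live in the interpreter):

import functools

def stopiteration_barrier(func):
    """Replace a StopIteration leaving func's frame, chaining the original."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except StopIteration as exc:
            raise RuntimeError("StopIteration escaped a frame") from exc
    return wrapper

@stopiteration_barrier
def helper(it):
    return next(it)

# helper(iter([])) now raises RuntimeError chained to the StopIteration,
# instead of silently terminating a surrounding loop or generator.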

On 3 November 2014 15:59, Guido van Rossum <guido@python.org> wrote:
Sorry, I didn't mean to give that impression. I'm in a similar situation to you in that regard - any specific examples I can think of trip my "that's too obscure to be maintainable" alarm (if there is a reasonable use case, I'd expect it to involve full generators, rather than generator expressions). The code in contextlib relies on the way *generators* handle StopIteration, and if I understand your proposal correctly, that would remain unchanged - only ordinary function calls would convert StopIteration to a different exception type, preserving the functional equivalence of a generator returning, raising StopIteration, or having StopIteration thrown into a yield point (it's that last one that contextlib relies on).
That's far more elegant than the stateful possibilities I was considering. So generators would continue to leave StopIteration untouched (preserving the equivalence between returning from the frame and explicitly raising StopIteration from the generator body), and only ordinary function invocations would gain the StopIteration -> UnexpectedStopIteration translation (assuming we went with a new exception type)?
It's the consequences that are hard to survey and describe in this case (as they affect subtle code depending on the current semantics).
Aye. I'm reasonably OK with the notion of breaking overly clever (and hence hard to follow) generator expressions, but I'm a little more nervous about any change that means that factoring out "raise StopIteration" in a full generator function would stop working. That said, such a change would bring generator functions fully into line with the "flow control should be locally visible" principle that guided both the with statement design and asyncio - only a local return or raise statement could gracefully terminate the generator, StopIteration from a nested function call would always escape as the new exception type. If you wanted to factor out a helper function that terminated the generator you'd have to do "return yield from helper()" rather than just "helper()". Regards, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
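A before/after sketch of the refactoring cost described above (names invented for illustration):

# Today: a helper can terminate the calling generator invisibly.
def take2(it):
    return next(it), next(it)  # StopIteration here silently ends the caller

def pairs(it):
    while True:
        yield take2(it)

# Under the proposal, termination would have to be locally visible:
def pairs_explicit(it):
    while True:
        try:
            pair = next(it), next(it)
        except StopIteration:
            return
        yield pair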

Guido van Rossum <guido@python.org> writes:
The pure Python implementation of itertools.groupby() provided in its docs [1] uses next() as the helper function that terminates the calling generator by raising StopIteration.

[1]: https://docs.python.org/3/library/itertools.html#itertools.groupby

Here's a simplified example:

from functools import partial

def groupby(iterable):
    """
    >>> ' '.join(k for k, g in groupby('AAAABBBCCDAABBB'))
    'A B C D A B'
    >>> ' '.join(''.join(g) for k, g in groupby('AAAABBBCCDAABBB'))
    'AAAA BBB CC D AA BBB'
    """
    next_value = partial(next, iter(iterable))

    def yield_same(group_value_):  # generate group values
        nonlocal value
        while value == group_value_:
            yield value
            value = next_value()  # exit on StopIteration

    group_value = value = object()
    while True:
        while value == group_value:  # discard unconsumed values
            value = next_value()  # exit on StopIteration
        group_value = value
        yield group_value, yield_same(group_value)

The alternative is to return a sentinel from next():

def groupby_done(iterable):
    done = object()  # sentinel
    next_value = partial(next, iter(iterable), done)

    def yield_same(group_value_):  # generate group values
        nonlocal value
        while value == group_value_:
            yield value
            value = next_value()
            if value is done:
                return

    group_value = value = object()
    while value is not done:
        while value == group_value:  # discard unconsumed values
            value = next_value()
            if value is done:
                return
        group_value = value
        yield group_value, yield_same(group_value)

The first code example exploits the fact that `next(it)` continues to raise StopIteration on subsequent calls. The second code example has to propagate the termination condition up the stack manually (`while True` is replaced with `while value is not done`).

-- Akira

On Nov 2, 2014, at 21:59, Guido van Rossum <guido@python.org> wrote:
The category is usually represented by the even more trivial and even more abusive example:

(prime if prime < 100 else throw(StopIteration) for prime in primes)

I do see these, mostly in places like StackOverflow, where someone was shown this "cool trick" by someone else, used it without understanding it, and now has no idea how to debug his code. (However, a few people on this list suggested it as an alternative to adding some kind of "syntactic takewhile" to the language, so it's possible not everyone sees it as abuse, even though I think you and others called it abuse back then as well.) Anyway, I agree that explicitly disallowing it would make the language simpler, eliminate more bugs than useful idioms, and possibly open the door to other improvements. But if you can't justify this abuse as being actually illegal by a reading of the docs in 3.0-3.4, and people are potentially using it in real code, wouldn't that require a period of deprecation before it can be broken?

On Mon, Nov 3, 2014 at 1:38 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
I have to apologize, I pretty much had this backwards. What I should have said is that a generator should always be terminated by a return or falling off the end, and if StopIteration is raised in the generator (either by an explicit raise or raised by something it calls) and not caught by an except clause in the same generator, it should be turned into something else, so it bubbles out as something that keeps getting bubbled up rather than silently ending a for-loop. OTOH StopIteration raised by a non-generator function should not be mutated.

I'm sorry if this invalidates Nick's endorsement of the proposal. I definitely see this as a serious backward incompatibility: no matter how often it leads to buggy or obfuscated code, it's how things have always worked.

Regarding Akira Li's examples of groupby(), unfortunately I find both versions inscrutable -- on a casual inspection I have no idea what goes on. I would have to think about how I would write groupby() myself (but I'm pretty sure I wouldn't use functools.partial(). :-)

-- --Guido van Rossum (python.org/~guido)

On 11/03/2014 06:29 PM, Guido van Rossum wrote:
I agree, and just happened to try. After several tries (it's tricky), I came to the conclusion that it fits the pattern of peeking at the first value of the iterator best. It seems to be a fairly common pattern, and hard to get right with iterators. For what it's worth, here is what I came up with. :-)

class PeekIter:
    """ Create an iterator which allows you to peek() at
        the next item to be yielded.  It also stores the
        exception and re-raises it on the next next() call.
    """
    def __init__(self, iterable):
        self._exn = None
        self._it = iter(iterable)
        self._peek = next(iter(iterable))

    def peek(self):
        return self._peek

    def ok(self):
        # True if no exception occurred when setting self._peek
        return self._exn is None

    def __iter__(self):
        return self

    def __next__(self):
        if self._exn is not None:
            raise self._exn
        t = self._peek
        try:
            self._peek = next(self._it)
        except Exception as exn:
            self._exn = exn
            self._peek = None
        return t

def group_by(iterable):
    """ Yield (key, (group))'s of like values from an iterator. """
    itr = PeekIter(iterable)

    def group(it):
        # Yield only like values.
        k = it.peek()
        while it.peek() == k:
            yield next(it)

    while itr.ok():
        yield itr.peek(), [x for x in group(itr)]

print(' '.join(''.join(k) for k, v in group_by('AAAABBBCCDAABBB')))
# 'A B C D A B'
print(' '.join(''.join(g) for k, g in group_by('AAAABBBCCDAABBB')))
# 'AAAA BBB CC D AA BBB'

Just a very minor correction: list(group(itr)) instead of the list comp. ...

On 11/03/2014 09:17 PM, Ron Adam wrote:
def group_by(iterable):
    """ Yield (key, (group))'s of like values from an iterator. """
    def group(it):
        # Yield only like values.
        k = it.peek()
        while it.peek() == k:
            yield next(it)

    itr = PeekIter(iterable)
    while itr.ok():
        yield itr.peek(), list(group(itr))

Cheers,
Ron

A bad edit got me... I don't mean to sidetrack the StopIteration discussion, but I also don't want anyone having to debug my mistakes. In the PeekIter __init__ method, change...

On 11/03/2014 09:17 PM, Ron Adam wrote:
self._peek = next(iter(iterable))
to...

    self._peek = next(self._it)

It was yielding an extra value on the first group.

As for the handling of the StopIteration, maybe adding a StopGenerator exception could ease the way. It could be guaranteed not to escape the generator it's raised in, while a StopIteration could propagate out until it is caught by a loop. Also, StopGenerator(value) could be equivalent to "return value"; the two, for now at least, overlap. StopGenerator(value) would be caught and converted, while a bare StopIteration would propagate out. The new exception would make for clearer and easier to understand code in coroutines, I think.

Another aspect of StopGenerator is that it would escape the inner loops of the generator it's raised in, like return does. So you could throw a StopGenerator exception into a generator whose yield is nested in multiple loops to stop it.

Of course all bets are off until it's actually tried, I think. Just because it sounds good here (or to me at the moment), doesn't mean it will work. Does it. :-)

Cheers,
Ron

On 4 November 2014 00:29, Guido van Rossum <guido@python.org> wrote:
It's worth noting that the functools.partial doesn't really do anything. Just changing

    next_value = partial(next, iter(iterable))

to

    iterator = iter(iterable)

and "next_value()" to "next(iterator)" gets rid of it.

I would have tackled it by letting the inner iterator set a "finished" flag, and I would have exhausted the iterable by iterating over it:

def groupby(iterable):
    # Make sure this is one-pass
    iterator = iter(iterable)
    finished = False

    # Yields a group
    def yield_group():
        nonlocal finished, group_key
        # This was taken off the iterator
        # by the previous group
        yield group_key
        for item in iterator:
            if item != group_key:
                # Set up next group
                group_key = item
                return
            yield item
        # No more items in iterator
        finished = True

    # This is always the head of the next
    # or current group
    group_key = next(iterator)
    while not finished:
        group = yield_group()
        yield group_key, group
        # Make sure the iterator is exhausted
        for _ in group:
            pass
        # group_key will now be the head of the next group

This does have a key difference. Whereas with groupby you have the confusing property that

from itertools import groupby
grps = groupby("|||---|||")
a = next(grps)
b = next(grps)
c = next(grps)
list(a[1])
#>>> ['|', '|', '|']

with mine this does not happen:

grps = groupby("|||---|||")
a = next(grps)
b = next(grps)
c = next(grps)
list(a[1])
#>>> []

Was this an oversight in the original design or is this actually desired? I would guess it's an oversight.

I guess implementing groupby() would be a good interview question. :-) Can we get back to the issue at hand, which is whether and how we can change the behavior of StopIteration to be less error-prone? On Mon, Nov 3, 2014 at 8:10 PM, Joshua Landau <joshua@landau.ws> wrote:
-- --Guido van Rossum (python.org/~guido)

Firstly, sorry for my buggy code that began all this mess. But what about adding a named parameter to the next function, specifying an exception to be raised on StopIteration, allowing it to propagate? On Nov 4, 2014, at 06:49, Guido van Rossum <guido@python.org> wrote:

On 11/4/2014 6:12 PM, Yotam Vaknin wrote:
There is already a 'default' option, so the two would have to be mutually exclusive.

"next(iterator[, default]) Retrieve the next item from the iterator by calling its __next__() method. If default is given, it is returned if the iterator is exhausted, otherwise StopIteration is raised."

I believe either adding an alternate exception, or raising the default if it is an exception, has been discussed before on this list. I cannot remember details other than the obvious fact that nothing was changed.

-- Terry Jan Reedy
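For reference, the existing default in action:

it = iter([1])
print(next(it, 'done'))  # 1
print(next(it, 'done'))  # 'done' -- exhaustion, but no StopIteration raised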

On 5/11/2014 1:55 p.m., Terry Reedy wrote:
If it were a keyword-only argument, there wouldn't be any conflict. But I'm not sure I like the idea of adding more complexities to the iterator protocol. Anything that changes the signature of __next__ or __iter__ would be a bad idea, since all existing iterables would need to be updated to take it into account. If this is is to be done, it might be better to add a new optional dunder method, so that existing iterables would continue to work. -- Greg

On 4 Nov 2014 10:31, "Guido van Rossum" <guido@python.org> wrote:
On Mon, Nov 3, 2014 at 1:38 PM, Greg Ewing <greg.ewing@canterbury.ac.nz>
wrote:
I actually thought this was what you meant originally, and while it *would* require changes to contextlib, they're fairly minor: the current check for StopIteration would change to catch "UnexpectedStopIteration" instead, and the exception identity check would look at __cause__ rather than directly at the caught exception.
Aye, breaking the equivalence between "return" and "raise StopIteration" is pretty major. I'm not even sure a plausible transition plan is possible, as at least contextlib would trigger any warning we might issue. Regards, Nick.

On 6 Nov 2014 00:09, "Nick Coghlan" <ncoghlan@gmail.com> wrote:
I definitely see this as a serious backward incompatibility: no matter
how often it leads to buggy or obfuscated code, it's how things have always worked.
Aye, breaking the equivalence between "return" and "raise StopIteration"
is pretty major.
I'm not even sure a plausible transition plan is possible, as at least
contextlib would trigger any warning we might issue.

And having said that... what if we introduced UnexpectedStopIteration but initially made it a subclass of StopIteration? We could issue a deprecation warning whenever we triggered the StopIteration -> UnexpectedStopIteration conversion, pointing out that at some point in the future (3.6? 3.7?), UnexpectedStopIteration will no longer be a subclass of StopIteration (perhaps becoming a subclass of RuntimeError instead?).

contextlib could avoid the warning by preconstructing a suitable UnexpectedStopIteration instance and throwing *that* into the generator, rather than throwing in a StopIteration raised from the with statement body.

Regards,
Nick.
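A sketch of the transitional arrangement described above (illustrative only; the class and helper are hypothetical):

import warnings

class UnexpectedStopIteration(StopIteration):
    """Transitional: subclasses StopIteration for now; a later release
    would rebase it on, e.g., RuntimeError."""

def convert(exc):
    # What the generator machinery might do when a StopIteration is
    # about to escape a generator frame during the deprecation period.
    warnings.warn("StopIteration escaping a generator frame is deprecated",
                  DeprecationWarning, stacklevel=2)
    replacement = UnexpectedStopIteration(*exc.args)
    replacement.__cause__ = exc
    return replacement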

On Wed, Nov 5, 2014 at 5:17 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
And having said that... what if we introduced UnexpectedStopIteration but initially made it a subclass of StopIteration?
Exactly! -- Juancarlo *Añez*

On 11/05/2014 03:47 PM, Nick Coghlan wrote:
I really don't like the name Unexpected anything. It makes me think of blue screens and page faults. :-/ I'm also not sure how it's supposed to work or where it should come from. As near as I can tell, these two examples below are equivalent. I think the thing that needs to be avoided is the case of the endless loop. It would be better to let the exception be noisy. How would the second example be changed in order to do that? Or is there some other thing that needs to be fixed?

Cheers,
Ron

def izip(*args):
    iters = [iter(obj) for obj in args]
    while True:
        yield list(next(it) for it in iters)
        # StopIteration suppressed by the list() call
        # consuming the genexp, resulting in empty lists.
        # While loop never exits.
        print("never printed")

a = izip([1,2],[3,4])
print(next(a), next(a), next(a))
# [1, 3] [2, 4] []

#list(izip([1,2],[3,4]))  # Currently never returns

def izip2(*args):
    iters = [iter(obj) for obj in args]
    while True:
        L = []
        for it in iters:
            try:
                obj = next(it)  # StopIteration suppressed here.
            except StopIteration:
                break  # exit for-loop
                       # "return" instead of "break"?
            L.append(obj)
        yield L
        # While loop never exits.
        print("never printed")

a = izip2([5,6],[7,8])
print(next(a), next(a), next(a))
# [5, 7] [6, 8] []

list(izip2([5,6],[7,8]))  # Currently never returns
# Not only doesn't it exit, but it's building
# an endless list of empty lists!

On Thu, Nov 06, 2014 at 07:47:09AM +1000, Nick Coghlan wrote:
I'm sorry, I have been trying to follow this thread, but there have been too many wrong turns and side-tracks for me to keep it straight. What is the problem this is supposed to solve? Is it just that list (and set and dict) comprehensions treat StopIteration differently than generator expressions? That is, that

[expr for x in iterator]
list(expr for x in iterator)

are not exactly equivalent, if expr raises StopIteration.

If so, it seems to me that you're adding a lot of conceptual baggage and complication for very little benefit, and this will probably confuse people far more than the current situation does. The different treatment of StopIteration in generator expressions and list comprehensions does not seem to be a problem for people in practice, judging by the python-list and tutor mailing lists.

The current situation is simple to learn and understand:

(1) Generator expressions *emit* StopIteration when they are done:

py> next(iter([]))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

(2) Functions such as tuple, list, set, dict *absorb* StopIteration:

py> list(iter([]))
[]
py> it = iter([])
py> list(next(it) for y in range(1000))
[]

For-loops do the same, if StopIteration is raised in the "for x in iterable" header. That's how it knows the loop is done. The "for" part of a comprehension is the same.

(3) But raising StopIteration in the expression part (or if part) of a comprehension does not absorb the exception; it is treated like any other exception:

py> [next(iter([])) for y in range(1000)]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in <listcomp>
StopIteration

If that is surprising to anyone, I suggest it is because they haven't considered what happens when you raise StopIteration in the body of a for-loop:

py> for y in range(1000):
...     next(iter([]))
...
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
StopIteration

To me, the status quo is consistent, understandable, and predictable. In contrast, you have:

- a solution to something which I'm not sure is even a problem that needs solving;
- but if it does, the solution seems quite magical, complicated, and hard to understand;
- it is unclear (to me) under what circumstances StopIteration will be automatically converted to UnexpectedStopIteration;
- and it seems to me that it will lead to surprising behaviour when people deliberately raise StopIteration only to have it mysteriously turn into a different exception, but only sometimes.

It seems to me that if the difference between comprehensions and generator expressions really is a problem that needs solving, the best way to proceed is using the __future__ mechanism. 3.5 could introduce

from __future__ import comprehensions_absorb_stopiteration

and then 3.6 or 3.7 could make it the default behaviour. We're still breaking backwards compatibility, but at least we're doing it cleanly, without magic (well, except the __future__ magic, but that's well-known and acceptable magic). There will be a transition period during which people can choose to keep the old behaviour or the new, and then we transition to the new behaviour.

This automatic transformation of some StopIterations into something else seems like it will be worse than the problem it is trying to fix.
For what it is worth, I'm a strong -1 on changing the behaviour of comprehensions at all, but if we must change it in a backwards incompatible way, +1 on __future__ and -1 on changing the exceptions to a different exception. -- Steven

On 11/06/2014 04:15 AM, Steven D'Aprano wrote:
It's the "when they are done" part that's having the issue. In some cases, they are never completely done, because the StopIteration error is handled incorrectly. The reason this doesn't show up more is that most iterators only have one loop, which is exited, and the generator ends, causing a different StopIteration to be emitted from the one that ends the loop.
Right, sometimes this doesn't happen.
I think this part is working as it should.
This is not always working. When a StopIteration is raised from the next call in the example that started the thread, it's getting replaced with the equivalent of a break, so it's never exiting the generator completely. I think it needs to be replaced with the equivalent of return, which will end the generator and cause a StopIteration to be emitted. (As you describe here.) It looks like it's a bug in the C code for generator expressions. Since it only affects generator expressions that have iterators with nested loops, it may be fixable.

Cheers,
Ron

On 11/6/2014 5:15 AM, Steven D'Aprano wrote:
'iter([])' is a list_iterator, not a generator expression. Here is the example I think you wanted.
Which is the Python translation of the comprehension.
I agree. -- Terry Jan Reedy

I believe the issue is that, in some situations, the generator is absorbing StopIteration when it shouldn't. -- ~Ethan~

TL;DR :-(

The crux of the issue is that sometimes the StopIteration raised by an explicit next() call is "just" an exception, but sometimes it terminates some controlling loop. Most precedents clearly establish it as "just" an exception, e.g. (these are all nonsense):

it = iter([1, 2, 3, -1])
for x in it:
    if x < 0:
        next(it)

it = iter([1, 2, 3])
[x for x in it if next(it) < 0]

Both these raise StopIteration and report a traceback.

But generators and generator expressions are different. In a generator, any StopIteration that isn't caught by an exception handler in the body of the generator implicitly terminates the iteration, just like "return" or falling off the end. This was an explicitly designed feature, but I don't think it has worked out very well, given that more often than not, a StopIteration that "escapes" from some expression is a bug. And because generator expressions are implemented using generators (duh), the following returns [] instead of raising StopIteration:

it = iter([1, 2, 3])
list(x for x in it if next(it) < 0)

This is confusing because it breaks the (intended) equivalence between list(<genexp>) and [<genexp>] (even though we refer to the latter as a comprehension, the syntax inside the [] is the same as a generator expression).

If I had had the right foresight, I would have made it an error to terminate a generator with a StopIteration, probably by raising another exception chained to the StopIteration (so the traceback shows the place where the StopIteration escaped).

The question at hand is if we can fix this post-hoc, using clever tricks and (of course) a deprecation period.

--Guido

On Thu, Nov 6, 2014 at 2:15 AM, Steven D'Aprano <steve@pearwood.info> wrote:
-- --Guido van Rossum (python.org/~guido)

On Thu, Nov 06, 2014 at 08:01:32PM -0430, Juancarlo Añez wrote:
What about it? There are 19 lines in the Zen, and I don't think any of them are particularly relevant here. Which line, or lines, were you thinking of?

-- Steven

On Fri, Nov 7, 2014 at 5:13 AM, Steven D'Aprano <steve@pearwood.info> wrote:
You asked what was the point in fixing the current behavior regarding StopIteration:

- Beautiful is better than ugly.
- Explicit is better than implicit.
- Simple is better than complex.
- Flat is better than nested.
- Readability counts.
- Special cases aren't special enough to break the rules.
- Errors should never pass silently.
- In the face of ambiguity, refuse the temptation to guess.
- There should be one-- and preferably only one --obvious way to do it.
- Now is better than never.
- If the implementation is hard to explain, it's a bad idea.

Basically, some of the behaviours Guido mentioned are unexpected consequences of a lack of foresight in the implementation of generator expressions. It is not in the Zen of Python to leave it as is just because some existing code may be relying on the odd behavior.

Just to be clear: that StopIteration will cancel more than one iterator is an unintended behavior, is difficult to explain, is of questionable usefulness, and is the source of difficult-to-catch bugs.

Cheers,

-- Juancarlo *Añez*

On Fri, Nov 7, 2014 at 11:57 PM, Juancarlo Añez <apalala@gmail.com> wrote:
It is not in the Zen of Python to leave it as is just because some existing code may be relying on the odd behavior.
Actually, it is. Practicality beats purity. :) ChrisA

On 7 November 2014 07:45, Antoine Pitrou <solipsis@pitrou.net> wrote:
It's not about people relying on the current behaviour (it can't be, since we're talking about *changing* that behaviour), it's about "Errors should never pass silently". That is, the problematic cases that (at least arguably) may be worth fixing are those where:

1. StopIteration escapes from an expression (Error!)
2. Instead of causing a traceback, it terminates a containing generator (Passing silently!)

As asyncio coroutines become more popular, I predict some serious head scratching from StopIteration escaping an asynchronous operation and getting thrown into a coroutine, which then terminates with a "return None" rather than propagating the exception as you might otherwise expect. The problem with this particular style of bug is that the only trace it leaves is a generator iterator that terminates earlier than expected - there's no traceback, log message, or any other indication of where something strange may be happening.

Consider the following, from the original post in the thread:

def izip(*args):
    iters = [iter(obj) for obj in args]
    while True:
        yield tuple([next(it) for it in iters])

The current behaviour of that construct is that, as soon as one of the iterators is empty:

1. next(it) throws StopIteration
2. the list comprehension unwinds the frame, and allows the exception to propagate
3. the generator iterator unwinds the frame, and allows the exception to propagate
4. the code invoking the iterator sees StopIteration and assumes iteration is complete

If you switch to the generator expression version instead, the flow control becomes:

1. next(it) throws StopIteration
2. the generator expression unwinds the frame, and allows the exception to propagate
3. the iteration inside the tuple constructor sees StopIteration and halts
4. the generator iterator never terminates

In that code, "next(it)" is a flow control operation akin to break (it terminates the nearest enclosing generator iterator, just as break terminates the nearest enclosing loop), but it's incredibly unclear that this is the case - there's no local indication that it may raise StopIteration; you need to "just know" that raising StopIteration is a possibility.

Guido's suggestion is to consider looking for a viable way to break the equivalence between "return" and "raise StopIteration" in generator iterators - that way, the only way for the above code to work would be through a more explicit version that clearly tracks the flow control.
Option 1 would be to assume we use a new exception, and are OK with folks catching it explicitly:

from __future__ import explicit_generator_return

def izip(*args):
    iters = [iter(obj) for obj in args]
    while True:
        try:
            t = tuple(next(it) for it in iters)
        except UncaughtStopIteration:
            return  # One of the iterators has been exhausted
        yield t

Option 2 would be to assume the new exception is something generic like RuntimeError, requiring the inner loop to be converted to statement form:

def izip(*args):
    iters = [iter(obj) for obj in args]
    while True:
        entry = []
        for it in iters:
            try:
                item = next(it)
            except StopIteration:
                return  # One of the iterators has been exhausted
            entry.append(item)
        yield tuple(entry)

With option 2, you can also still rely on the fact that list comprehensions don't create a generator frame:

def izip(*args):
    iters = [iter(obj) for obj in args]
    while True:
        try:
            entry = [next(it) for it in iters]
        except StopIteration:
            return  # One of the iterators has been exhausted
        yield tuple(entry)

The upside of the option 2 spellings is they'll work on all currently supported versions of Python, while the downside is the extra object construction they have to do if you want to yield something other than a list.

Cheers,
Nick.

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On 11/11/2014 06:21 AM, Nick Coghlan wrote:
When I was experimenting with this earlier, I needed the try-except to catch the StopIteration exception in order to do a "break", which gave me the current behaviour of the generator expression being discussed. Replacing break with return, as above, gave the expected behaviour, but also just removing the try-except and letting the StopIteration propagate out worked as well. That is, StopIteration(None) is equivalent to "return None" in the context above.

Can you point me to the source file that implements the generator expression byte code or C code? I wanted to look at that to see what was actually going on, but it looks like it may be a combination of a regular generator with a condition at some point to handle it slightly differently.

Cheers,
Ron

On 12 November 2014 02:38, Ron Adam <ron3200@gmail.com> wrote:
When a StopIteration escapes from a generator frame, it's currently just propagated without modification. When the generator frame *returns*, the generator object converts that to raising StopIteration: https://hg.python.org/cpython/file/30a6c74ad87f/Objects/genobject.c#l117 An implementation of Guido's __future__ import idea would likely involve setting a flag on generator iterators when they're instantiated to say whether or not to intercept and convert StopIteration instances that escape from the generator frame. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
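The return-to-StopIteration conversion is easy to observe from Python (a small demo):

def returns_value():
    if False:
        yield  # make this a generator function
    return 'done'

g = returns_value()
try:
    next(g)
except StopIteration as exc:
    # the frame's return value was converted by the generator object
    print(exc.value)  # done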

Sorry for my radio silence in this thread -- it was a combination of a conference and getting sick afterwards. IIUC Nick prefers reusing an existing exception over inventing a new one (to mean that a StopIteration was about to escape from a generator frame). The motivation is that this encourages catching the StopIteration at the source rather than relying on the new exception, thereby encouraging code that also works with previous versions. I like this. Nobody has suggested anything other than RuntimeError so let's go with that. I also like the idea of a __future__ import to request the new behavior in Python 3.5, and a warning if the error condition is detected while the new behavior isn't requested. I assume the __future__ import applies to generator function definitions, not calls. (But someone should reason this through before we commit.) Do we need a PEP for this or can we just go ahead? I haven't heard any pushback on the basic premise that this is a problem we'd like to fix. PS. If we decide not to go ahead with this, there's a small change to the semantics of "return" in a generator that might allow asyncio to distinguish between an intended return statement in the generator and an accidentally escaping StopIteration -- the return case should use a newly defined subclass of StopIteration. asyncio's _step() function can then tell the two situations apart easily. On Tue, Nov 11, 2014 at 4:21 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
-- --Guido van Rossum (python.org/~guido)

On 11/13/2014 12:04 PM, Guido van Rossum wrote:
Do we need a PEP for this or can we just go ahead? I haven't heard any pushback on the basic premise that this is a problem we'd like to fix.
Given the confusion about what the problem is, and the possible fixes, I think a short PEP would be in order just so everyone is on the same page. I don't fully understand everything discussed so far (not a big user of generators), so I'll only throw my hat in as a backup volunteer to write the PEP if no one else is able to take the time.

-- ~Ethan~

On 14 November 2014 08:20, Ethan Furman <ethan@stoneleaf.us> wrote:
Agreed, I think it's worth having an explanatory PEP at least for documentation purposes. It also makes it easier to reference from the 3.5 What's New, as there may be some code that's relying on the current behaviour that may need adjusting to use a custom exception type rather than StopIteration (or else refactoring to avoid crossing a generator frame boundary). Regards, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

Someone please volunteer to write this PEP. I can review, but I need to save my Python time to work on the type hinting PEP. On Thu, Nov 13, 2014 at 11:41 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
-- --Guido van Rossum (python.org/~guido)

On Sat, Nov 15, 2014 at 4:30 AM, Guido van Rossum <guido@python.org> wrote:
Someone please volunteer to write this PEP. I can review, but I need to save my Python time to work on the type hinting PEP.
After the stunning success :) of my last such endeavour (exception-catching expressions), I guess I could put my hand up for this one, if there's no better volunteer forthcoming. ChrisA

On Fri, Nov 14, 2014 at 6:41 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
Agreed, I think it's worth having an explanatory PEP at least for documentation purposes.
Draft has been sent to the PEP editors, but if anyone wants to preview it, it's currently here: https://raw.githubusercontent.com/Rosuav/GenStopIter/master/pep-xxx.txt ChrisA

On Thu, Nov 13, 2014 at 12:04:52PM -0800, Guido van Rossum wrote:
Do we need a PEP for this or can we just go ahead? I haven't heard any pushback on the basic premise that this is a problem we'd like to fix.
*puts hand up*

I'm not convinced that this is *a problem to be fixed*. At worst it is a "gotcha" to be aware of. The current behaviour is simple to understand: raising StopIteration halts the generator, end of story.

I'm still not sure that I understand what the proposed fix is (a PEP will be good to explain that), but if I have understood it correctly, it turns a simple concept like "StopIteration halts the generator" into something more complicated: some StopIterations will halt the generator, others will be chained to a new, unrelated exception.

Judging by the complete lack of questions about this on the tutor and python-list mailing lists, the slight difference in behaviour between generator expressions and comprehensions is not an issue in practice. I've seen people ask about the leakage of variables from comprehensions; I've never seen people ask about the different treatment of StopIteration. I have, however, seen people *rely* on that different treatment. E.g. to implement a short-circuiting generator expression that exits when a condition is reached:

(expr for x in sequence if cond or stop())

where stop() raises StopIteration and halts the generator. If this change goes ahead, it will break code that does this.

Having generator expressions and comprehensions behave exactly the same leads to the question, why do we have comprehensions? I guess the answer is "historical reasons", but from a pragmatic point of view, being able to choose between

[expr for x in sequence]
list(expr for x in sequence)

depending on how you want StopIteration to be treated may be useful.

-- Steven
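Filled out, the idiom Steven describes looks like this (a sketch; the stop() helper and sample data are invented, and the output shown is the current, pre-change behaviour):

def stop():
    raise StopIteration

sequence = [1, 2, 3, -1, 4]
gen = (x for x in sequence if x > 0 or stop())
print(list(gen))  # [1, 2, 3] -- the escaping StopIteration ends the genexp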

On Sat, Nov 15, 2014 at 5:48 PM, Steven D'Aprano <steve@pearwood.info> wrote:
I'm still not sure that I understand what the proposed fix is (a PEP will be good to explain that)
Draft PEP exists, and now has your concerns incorporated. https://raw.githubusercontent.com/Rosuav/GenStopIter/master/pep-xxx.txt ChrisA

On Thu, Nov 06, 2014 at 10:54:51AM -0800, Guido van Rossum wrote:
TL;DR :-(
That's how I feel about this whole thread ;-) [...]
Do we need "clever tricks"? In my earlier email, I suggested that if this needs to be fixed, the best way to introduce a change in behaviour is with the __future__ mechanism. 3.5 could introduce

from __future__ import stopiteration_is_an_error

(in my earlier post, I managed to get the suggested behaviour completely backwards) and then 3.6 could raise a warning and 3.7 could make it the default behaviour. We're still breaking backwards compatibility, but at least we're doing it cleanly, without clever and/or ugly hacks. There will be a transition period during which people can choose to keep the old behaviour or the new, and then we transition to the new behaviour.

-- Steven

Trying to keep the thread on focus... On Fri, Nov 7, 2014 at 3:20 AM, Steven D'Aprano <steve@pearwood.info> wrote:
Sure. (Though IMO __future__ itself is a clever hack.)
We'd still need to specify the eventual behavior. I propose the following as a strawman (after the deprecation period is finished and the __future__ import is no longer needed):

- If a StopIteration is about to bubble out of a generator frame, it is replaced with some other exception (maybe RuntimeError, maybe a new custom Exception subclass, but *not* deriving from StopIteration) which causes the next() call (which invoked the generator) to fail, passing that exception out. From then on it's just like any old exception.

During the transition, we check if the generator was defined in the scope of the __future__, and if so, we do the same thing; otherwise, we issue a warning and let the StopIteration bubble out, eventually terminating some loop or generator expression.

It would be nice if, when the warning is made fatal (e.g. through the -W flag), the exception raised was the same one mentioned above (i.e. RuntimeError or a custom subclass -- I don't care much about this detail).

-- --Guido van Rossum (python.org/~guido)
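Sketching what the strawman would mean for code using a buggy generator (the exception type is illustrative, per the caveat above):

def gen():
    yield 1
    next(iter([]))  # a StopIteration about to bubble out of the frame
    yield 2

# Today list(gen()) silently returns [1].  Under the strawman it would
# fail loudly instead, with the original exception chained:
try:
    list(gen())
except RuntimeError as exc:
    print(type(exc.__cause__))  # <class 'StopIteration'>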

On 11/7/2014 12:10 PM, Guido van Rossum wrote:
Double meanings are a constant problem. The ambiguity of StopIteration (unintended bug indicator? intended stop signal?) is not needed within generator functions*. An explicit 'raise StopIteration' is more easily written 'return'. The rare, clever 'yield expression' that means 'either yield the value of expression or raise StopIteration' can be re-written:

try:
    tem = expression
except StopIteration:
    return
yield tem

An anonymous generator expression with a 'yield or raise' expression can be re-written as a named generator function. In this context, allowing 'yield or raise' is too clever in that it breaks the desired (intended?) equivalence between 'list(genexp)' and '[genexp]'. I agree that this is a bad tradeoff. Such alternatives could be documented.

* StopIteration raised within a __next__ method could be a bug, but a) this should be rare, b) the possibility cannot be eliminated, and c) it would be too expensive to not take StopIteration at face value.
- If a StopIteration is about to bubble out of a generator frame, it is replaced with some other exception (maybe RuntimeError,
I support this particular replacement. To me: 1. The proposal amounts to defining StopIteration escaping a running generator frame as a runtime error.
maybe a new custom Exception subclass, but *not* deriving from StopIteration)
2. There are already more Exception classes than I can remember. 3. This error should be very rare after the transition. 4. A good error message will be at least as important as the class name. 5. A new, specific exception class is an invitation for people to write code that raises StopIteration so that its replacement can be caught in an except clause in the calling code. If the existing builtin exceptions other than StopIteration are not enough to choose from, one can define a custom class.
Any replacement will be ordinary as soon as it is made, before it reaches next(), so I would move this last line up a couple of lines. -- Terry Jan Reedy

On Nov 6, 2014, at 10:54, Guido van Rossum <guido@python.org> wrote:
This is confusing because it breaks the (intended) equivalence between list(<genexp>) and [<genexp>] (even though we refer to the latter as a comprehension, the syntax inside the [] is the same as a generator expression.
If this change (I mean the proposed clever workaround, not the "no terminating generators with StopIteration" change that's too radical) would be sufficient to make that equivalence actually true instead of just pretty close, I think that's reason enough to fix it on its own. Especially since that would make it easy to fix the genexpr docs. (Read 6.2.8 and tell me what it says the semantics of a genexpr are, and what values it yields. Now try to think of a way to fix that without repeating most of the text from 6.2.4, which nobody wants to do. If the docs could just define the semantics of genexprs, then define listcomps by saying that [<genexpr>] is equivalent to list(<genexpr>), that would be a lot simpler and clearer.)


On 2 November 2014 02:50, Guido van Rossum <guido@python.org> wrote:
The closest existing example of this kind of generator instance specific StopIteration handling that I can think of is the special case handling of StopIteration in contextlib._GeneratorContextManager.__exit__() (https://hg.python.org/cpython/file/3.4/Lib/contextlib.py#l63). There, the exception handling differentiates between "a specific StopIteration instance that we just threw into the subgenerator" (which it will allow to propagate) and "any other StopIteration instance, which indicates that the wrapped generator iterator terminated as expected" (which it will suppress). We had that wrong initially - if I recall correctly, it was PJE that noticed the problem before 2.5 was released. However, the only reason we were able to make it work is that we knew the exact identity of the exception we were throwing in, rather than just its type - we don't have that luxury in the general case. Getting back to the behaviour that prompted the thread, like a lot of exception handling quirks, it comes down to being very careful about the scope of exception handlers. In this case, the "next(it)" call is inside a generator expression, and hence inside the scope of the expression's StopIteration handling. By contrast, the comprehension version doesn't *have* any implicit exception handling, so the StopIteration escapes to terminate the containing generator. In terms of changing the behaviour of generator expressions to allow other StopIteration instances to propagate, I believe I do see one possible way to do it that limits the degree of backwards incompatibility. Firstly, you'd need to add a general purpose capability to generator iterators:

    def _set_default_exception(exc):
        """Supply a specific StopIteration instance to raise
        when the generator frame returns None"""
        ...

Normally, when the generator execution falls off the end of the frame by returning, the interpreter raises StopIteration if the result is None, or StopIteration(result) if the result is not None. With the new method, you could set a specific instance to be raised when the underlying result of the frame is None. (Side note: "return" and "raise StopIteration" in a generator function aren't *exactly* the same, as only the former relies on the return->raise conversion supplied by the surrounding generator iterator object) That part would be entirely backwards compatible, and would allow you to distinguish whether calling "next", "send" or "throw" on any generator threw StopIteration because the underlying frame returned None (by checking if the StopIteration instance was the one you configured to be raised on a None result), or because it either returned a non-None value or else something running inside that frame threw StopIteration. The backwards incompatible part would be to then also change generator expressions to set a specific StopIteration instance to be raised when the underlying frame returned, and allow all other StopIteration instances to escape, just as contextlib._GeneratorContextManager.__exit__ allows StopIteration instances thrown from the body of the with statement to escape. I think the end result of such a change would definitely be less surprising, as it would make generator expressions behave more like the corresponding comprehensions, and eliminate a hidden infinite loop bug. However, I'm not sure if it's *sufficiently* less surprising to be worth changing - especially since it would mean incurring a small amount of additional runtime overhead for each generator expression. Regards, Nick. P.S.
As additional background on the current difference in behaviour between list comprehensions and generator expressions, that has its roots in the same idiosyncrasy where putting a yield expression inside a comprehension actually *turns it into a generator expression*. Comprehensions are full closures, but they don't contain a yield expression, so you get a normal function, which the interpreter then calls. The interpreter doesn't actually do anything particularly special to make a generator expression instead - it just implicitly inserts a yield expression into the closure, which then automatically makes it a generator function instead of a normal one. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
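The identity check Nick describes looks roughly like this (a simplified sketch of _GeneratorContextManager.__exit__ from the 3.4 contextlib, trimmed for illustration; not the complete implementation):

    class _GeneratorContextManager:
        def __init__(self, gen):
            self.gen = gen

        def __enter__(self):
            return next(self.gen)

        def __exit__(self, type, value, traceback):
            if type is None:
                try:
                    next(self.gen)
                except StopIteration:
                    return
                raise RuntimeError("generator didn't stop")
            if value is None:
                # Need an exception *instance* so identity can be compared below
                value = type()
            try:
                self.gen.throw(type, value, traceback)
                raise RuntimeError("generator didn't stop after throw()")
            except StopIteration as exc:
                # Suppress only if this StopIteration is *not* the exact
                # instance we threw in -- i.e. the generator exited normally.
                return exc is not value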

On 2 November 2014 19:53, Nick Coghlan <ncoghlan@gmail.com> wrote:
Bah, I should have fully read Terry's reply before responding. He's right, it's the tuple call that's suppressing the exception, not the generator expression itself. That changes the possible solution, by tweaking it to be an optional extension to the iterator protocol, allowing iterators to make the terminating exception configurable:

    def __setiterexc__(exc):
        """Specify the exception instance to raise
        when the iterator is exhausted"""
        ...

Iterator consumers (like tuple) could then check for that method and use it to set a specific StopIteration instance, allowing all others to escape. I believe actually doing this would be adding too much complexity for too little gain, but it *is* possible. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
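To make the shape of that concrete, a consumer like tuple() might use the hook roughly as follows (a sketch only; __setiterexc__ is hypothetical and was never added to the protocol):

    def consume_all(iterable):
        # Stand-in for what a consumer like tuple() might do internally.
        it = iter(iterable)
        my_stop = StopIteration()  # instance unique to this consumer
        setiterexc = getattr(it, '__setiterexc__', None)  # hypothetical hook
        if setiterexc is not None:
            setiterexc(my_stop)    # ask the iterator to signal exhaustion with our instance
        items = []
        while True:
            try:
                items.append(next(it))
            except StopIteration as exc:
                if setiterexc is None or exc is my_stop:
                    return items   # genuine exhaustion
                raise              # someone else's StopIteration: let it escape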

This is related to the fact that, although the docs imply otherwise, [COMP] isn't exactly equivalent to list(COMP), because of cases like: def ensure_positive(x): if x<=0: raise StopIteration return True eggs = list(x for x in spam if raise_on_negative(x)) def ensure_positive(x); if x<=0: raise StopIteration return x eggs = list(ensure_positive(x) for x in spam) In both cases, this acts like a "takewhile": eggs ends up as a list of the initial positive values, and the first non-positive value is consumed and discarded. But if you do the same thing with a list comprehension, the comprehension is aborted by the StopIteration, and eggs never gets set (although the same values are consumed from spam, of course). IIRC, you asked me what the performance costs would be of changing listcomps to match, and for a trivial comp it worked out to be about 40% for the naive solution (build a genexpr, call it, call list) and about 20% with specially-optimized bytecode. So everyone agreed that even if this is a bug, that would be too much of a cost for too small of a fix. Of course here the issue is almost the opposite. But they're clearly related; how different comprehensions "leak" exceptions differs. Sent from a random iPhone On Nov 2, 2014, at 2:15, Nick Coghlan <ncoghlan@gmail.com> wrote:

On 11/2/2014 2:50 PM, Andrew Barnert wrote:
This is related to the fact that, although the docs imply otherwise, [COMP] isn't exactly equivalent to list(COMP),
That purported equivalence is a common meme, which I may have helped spread. If it is implied in the doc, it should be changed. Before I answered on this thread yesterday, I looked for such an implication in the Language Reference (though not the Tutorial), and only found this carefully written description in the expressions chapter. "In this case, the elements of the new container are those that would be produced by considering each of the for or if clauses a block, nesting from left to right, and evaluating the expression to produce an element each time the innermost block is reached. Note that the comprehension is executed in a separate scope," IE, a comprehension is equivalent to the result of for and if statements in a separate Python function that first initializes a collection object (list, set, or dict) and augments the collection in the innermost scope with the element produced (which is a key,value pair for dicts). Such an equivalent function would not catch any exception raised by the innermost element expression. On the other hand, collection initializers, such as list or tuple, specifically catch StopIteration when fed an iterable. Hence the two cannot be equivalent when the element expression raises StopIteration. -- Terry Jan Reedy
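Terry's statement-form equivalence can be spelled out as a sketch (f is a placeholder for the element expression):

    def comp_equivalent(f, iterable):
        # What [f(x) for x in iterable] does, as statements in its own scope:
        result = []
        for x in iterable:           # the for clause absorbs StopIteration from next()
            result.append(f(x))      # ...but an exception raised by f(), including
                                     # StopIteration, escapes this function
        return result

    # By contrast, list(f(x) for x in iterable) wraps the loop in a generator,
    # and list() treats any StopIteration escaping that generator as exhaustion.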

On Nov 2, 2014, at 13:00, Terry Reedy <tjreedy@udel.edu> wrote:
We looked through this last year and decided nothing needed to be changed in the docs, so I doubt it's worth repeating that effort. IIRC, the tutorial may have been confusing in the past but wasn't as of 3.3, and the only place that might confuse anyone was the what's new in 3.0, which is generally only changed to add notes about things which were un-/re-changed (like u"" strings). But again, even if I'm remembering wrong, I don't think it matters. All that being said, I don't really love the reference docs here. Neither 6.2.8 nor anything else explicitly says what the semantics are. It's a pretty obvious guess that the syntax is interpreted the same as for comprehensions (as in 6.2.4), and that the values yielded are those that would be used to produce elements. But the docs don't actually say that.

On Sun, Nov 2, 2014 at 1:00 PM, Terry Reedy <tjreedy@udel.edu> wrote:
I may have started it. I was aware of the non-equivalence (only mostly-equivalence) in Python 2 and I wanted to make them identical in Python 3 -- having one construct being exactly equivalent to another reduces the amount of explaining needed. Unfortunately, people had started to depend on the (in my *current* opinion deplorable) behavior of generator expressions in the face of StopIteration thrown by arbitrary parts of the expression or condition, and the equivalence is still imperfect. At least the variable leakage has been fixed. I know that when we first introduced generators (not generator expressions) I was in favor of interpreting arbitrary things that raise StopIteration in a generator to cause the generator to terminate just as if it had decided to stop (i.e. 'return' or falling off the end), because I thought there were some useful patterns that could be written more compactly this way -- in particular, the pattern where a generator iterates over another iterator by calling next() on it, does some processing on the value thus produced, and then yields the processed value (or not), and where the logical response to a StopIteration from the inner iterator is to exit the generator. For example:

    def only_positive(it):
        while True:
            x = next(it)
            if x > 0:
                yield x

This *particular* example is much better written as:

    def only_positive(it):
        for x in it:
            if x > 0:
                yield x

but the idea was that there might be variants where being constrained by a single for-loop would make the code less elegant if you had to catch the StopIteration and explicitly exit the generator. However, I don't think this idea has panned out. I haven't done a survey, but I have a feeling that in most cases where an explicit next() call is used (as opposed to a for-loop) there's a try/except StopIteration around it, and a fair amount of time is wasted debugging situations where a StopIteration unexpectedly escapes and silently interrupts some loop over an unrelated generator (instead of loudly bubbling up to the top and causing a traceback, which would be more debuggable). And the use case of raising StopIteration from a condition used in a generator expression is iffy at best (it makes the condition function hard to use in other contexts, and it calls attention to the difference between generators and comprehensions). So I will go out on a limb here and say that this was a mistake and if we can think of easing the transitional pain it would be a good thing to fix this eventually. -- --Guido van Rossum (python.org/~guido)

On 3 November 2014 13:01, Guido van Rossum <guido@python.org> wrote:
I think I'm guilty as well - when I was working on the Python 3 changes, getting the *scoping* behaviour to be identical between comprehensions and generator expressions was one of the key objectives, so I regularly described it as making "[x for x in seq]" equivalent to "list(x for x in seq)". I unfortunately didn't notice the remaining exception handling differences at the time, or we might have been able to do something about it for 3.0 :(
Having had to do the dance to work around the current behaviour in contextlib, I'm inclined to agree - there's a significant semantic difference between the "this iterable just terminated" StopIteration, and the "something threw StopIteration and nobody caught it", and the current model hides it. However, I also see potentially significant backwards compatibility problems when it comes to helper functions that throw StopIteration to terminate the calling generator - there would likely need to be some kind of thread local state and a helper "iterexit()" builtin and "PyIter_Exit()" C API to call instead of raising StopIteration directly. Making such a change would involve a lot of code churn just to phase out a relatively obscure issue that mostly just makes certain bugs harder to diagnose (as was the case with the generator expression based izip implementation in this thread), rather than directly causing bugs in its own right. Regards, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Sun, Nov 2, 2014 at 8:02 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
That's what I was afraid of. Can you point me to an example of code that depends on this that isn't trivial like Andrew Barnert's ensure_positive() example? I think that particular example, and the category it represents, are excessive cleverness that abuse the feature under discussion -- but you sound like you have helpers for context managers that couldn't be easily dismissed like that.
Maybe. But the real-life version of that bug can be *really* hard to find, and that's usually the kind of thing we'd like to fix. FWIW the implementation of my proposal is easy to describe (which the Zen approves of): when a StopIteration leaves a frame, replace it with some other exception (either a new, custom one or maybe just RuntimeError), chaining the original StopIteration. It's the consequences that are hard to survey and describe in this case (as they affect subtle code depending on the current semantics). -- --Guido van Rossum (python.org/~guido)
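The bug Guido describes is easy to reproduce (behaviour under the 3.4 semantics discussed in this thread):

    def check(x):
        if x < 0:
            raise StopIteration   # buggy helper: meant as an error signal
        return x

    def process(items):
        for x in items:
            yield check(x)

    # The StopIteration from check() escapes the generator frame and is
    # taken as normal exhaustion -- no traceback, the data is silently cut:
    print(list(process([1, 2, -3, 4])))   # -> [1, 2]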

On 3 November 2014 15:59, Guido van Rossum <guido@python.org> wrote:
Sorry, I didn't mean to give that impression. I'm in a similar situation to you in that regard - any specific examples I can think of trip my "that's too obscure to be maintainable" alarm (if there is a reasonable use case, I'd expect it to involve full generators, rather than generator expressions). The code in contextlib relies on the way *generators* handle StopIteration, and if I understand your proposal correctly, that would remain unchanged - only ordinary function calls would convert StopIteration to a different exception type, preserving the functional equivalence of a generator returning, raising StopIteration, or having StopIteration thrown into a yield point (it's that last one that contextlib relies on).
That's far more elegant than the stateful possibilities I was considering. So generators would continue to leave StopIteration untouched (preserving the equivalence between returning from the frame and explicitly raising StopIteration from the generator body), and only ordinary function invocations would gain the StopIteration -> UnexpectedStopIteration translation (assuming we went with a new exception type)?
It's the consequences that are hard to survey and describe in this case (as they affect subtle code depending on the current semantics).
Aye. I'm reasonably OK with the notion of breaking overly clever (and hence hard to follow) generator expressions, but I'm a little more nervous about any change that means that factoring out "raise StopIteration" in a full generator function would stop working. That said, such a change would bring generator functions fully into line with the "flow control should be locally visible" principle that guided both the with statement design and asyncio - only a local return or raise statement could gracefully terminate the generator, StopIteration from a nested function call would always escape as the new exception type. If you wanted to factor out a helper function that terminated the generator you'd have to do "return yield from helper()" rather than just "helper()". Regards, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

Guido van Rossum <guido@python.org> writes:
The pure Python implementation of itertools.groupby() provided in its docs [1] uses next() as the helper function that terminates the calling generator by raising StopIteration.

[1]: https://docs.python.org/3/library/itertools.html#itertools.groupby

Here's a simplified example:

    from functools import partial

    def groupby(iterable):
        """
        >>> ' '.join(k for k, g in groupby('AAAABBBCCDAABBB'))
        'A B C D A B'
        >>> ' '.join(''.join(g) for k, g in groupby('AAAABBBCCDAABBB'))
        'AAAA BBB CC D AA BBB'
        """
        next_value = partial(next, iter(iterable))

        def yield_same(group_value_):  # generate group values
            nonlocal value
            while value == group_value_:
                yield value
                value = next_value()  # exit on StopIteration

        group_value = value = object()
        while True:
            while value == group_value:  # discard unconsumed values
                value = next_value()  # exit on StopIteration
            group_value = value
            yield group_value, yield_same(group_value)

The alternative is to return a sentinel from next():

    def groupby_done(iterable):
        done = object()  # sentinel
        next_value = partial(next, iter(iterable), done)

        def yield_same(group_value_):  # generate group values
            nonlocal value
            while value == group_value_:
                yield value
                value = next_value()
                if value is done:
                    return

        group_value = value = object()
        while value is not done:
            while value == group_value:  # discard unconsumed values
                value = next_value()
                if value is done:
                    return
            group_value = value
            yield group_value, yield_same(group_value)

The first code example exploits the fact that `next(it)` continues to raise StopIteration on subsequent calls. The second code example has to propagate the termination condition up the stack manually (`while True` is replaced with `while value is not done`). -- Akira

On Nov 2, 2014, at 21:59, Guido van Rossum <guido@python.org> wrote:
The category is usually represented by the even more trivial and even more abusive example: (prime if prime<100 else throw(StopIteration) for prime in primes) I do see these, mostly on places like StackOverflow, where someone was shown this "cool trick" by someone else, used it without understanding it, and now has no idea how to debug his code. (However, a few people on this list suggested it as an alternative to adding some kind of "syntactic takewhile" to the language, so it's possible not everyone sees it as abuse, even though I think you and others called it abuse back then as well.) Anyway, I agree that explicitly disallowing it would make the language simpler, eliminate more bugs than useful idioms, and possibly open the door to other improvements. But if you can't justify this abuse as being actually illegal by a reading of the docs in 3.0-3.4, and people are potentially using it in real code, wouldn't that require a period of deprecation before it can be broken?
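For illustration, the trick Andrew mentions and a robust spelling side by side (a sketch; the primes list is an arbitrary stand-in, and the trick's behaviour is the 3.4 behaviour under discussion):

    from itertools import takewhile

    def throw(exc):
        raise exc

    primes = iter([2, 3, 89, 97, 101, 103])
    # The trick: abuse StopIteration to stop early (breaks under the proposal):
    small = list(p if p < 100 else throw(StopIteration) for p in primes)
    # -> [2, 3, 89, 97]

    primes = iter([2, 3, 89, 97, 101, 103])
    # The explicit spelling, with the flow control visible:
    small = list(takewhile(lambda p: p < 100, primes))
    # -> [2, 3, 89, 97]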

On Mon, Nov 3, 2014 at 1:38 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
I have to apologize, I pretty much had this backwards. What I should have said is that a generator should always be terminated by a return or falling off the end, and if StopIteration is raised in the generator (either by an explicit raise or raised by something it calls) and not caught by an except clause in the same generator, it should be turned into something else, so it bubbles out as something that keeps getting bubbled up rather than silently ending a for-loop. OTOH StopIteration raised by a non-generator function should not be mutated. I'm sorry if this invalidates Nick's endorsement of the proposal. I definitely see this as a serious backward incompatibility: no matter how often it leads to buggy or obfuscated code, it's how things have always worked. Regarding Akira Li's examples of groupby(), unfortunately I find both versions inscrutable -- on a casual inspection I have no idea what goes on. I would have to think about how I would write groupby() myself (but I'm pretty sure I wouldn't use functools.partial()). :-) -- --Guido van Rossum (python.org/~guido)

On 11/03/2014 06:29 PM, Guido van Rossum wrote:
I agree, and just happened to try. After several tries (it's tricky), I came to the conclusion that it best fits the pattern of peeking at the first value of the iterator. It seems to be a fairly common pattern and hard to get right with iterators. For what it's worth here is what I came up with. :-)

    class PeekIter:
        """
        Create an iterator which allows you to peek() at the next item
        to be yielded. It also stores the exception and re-raises it on
        the next next() call.
        """
        def __init__(self, iterable):
            self._exn = None
            self._it = iter(iterable)
            self._peek = next(iter(iterable))

        def peek(self):
            return self._peek

        def ok(self):
            # True if no exception occurred when setting self._peek
            return self._exn == None

        def __iter__(self):
            return iter(self)

        def __next__(self):
            if self._exn != None:
                raise self._exn
            t = self._peek
            try:
                self._peek = next(self._it)
            except Exception as exn:
                self._exn = exn
                self._peek = None
            return t

    def group_by(iterable):
        """ Yield (key, (group))'s of like values from an iterator. """
        itr = PeekIter(iterable)

        def group(it):
            # Yield only like values.
            k = it.peek()
            while it.peek() == k:
                yield next(it)

        while itr.ok():
            yield itr.peek(), [x for x in group(itr)]

    print(' '.join(''.join(k) for k, v in group_by('AAAABBBCCDAABBB')))
    # 'A B C D A B'
    print(' '.join(''.join(g) for k, g in group_by('AAAABBBCCDAABBB')))
    # 'AAAA BBB CC D AA BBB'

Just a very minor correction, list(group(itr)) instead of the list comp. ... On 11/03/2014 09:17 PM, Ron Adam wrote:
    def group_by(iterable):
        """ Yield (key, (group))'s of like values from an iterator. """
        def group(it):
            # Yield only like values.
            k = it.peek()
            while it.peek() == k:
                yield next(it)

        itr = PeekIter(iterable)
        while itr.ok():
            yield itr.peek(), list(group(itr))

Cheers, Ron

A bad edit got me... Don't mean to sidetrack the StopIteration discussion, but also don't want anyone having to debug my mistakes. In the PeekIter __init__ method... On 11/03/2014 09:17 PM, Ron Adam wrote:
self._peek = next(iter(iterable))
to...

    self._peek = next(self._it)

It was yielding an extra value on the first group. As for the handling of StopIteration, maybe adding a StopGenerator exception could ease the way. It could be guaranteed not to escape the generator it's raised in, while a StopIteration could propagate out until it is caught by a loop. Also StopGenerator(value) could be equivalent to return value. The two for now at least overlap: StopIteration(value) would be caught and converted, while a bare StopIteration would propagate out. The new exception would make for clearer and easier to understand code in coroutines, I think. Another aspect of StopGenerator is that it would escape inner loops of the generator it's raised in, like return does. And so you could throw a StopGenerator exception into a generator whose yield is nested in multiple loops to stop it. Of course all bets are off until it's actually tried, I think. Just because it sounds good here (or to me at the moment) doesn't mean it will work. Does it. :-) Cheers, Ron

On 4 November 2014 00:29, Guido van Rossum <guido@python.org> wrote:
It's worth noting that the functools.partial doesn't really do anything. Just changing

    next_value = partial(next, iter(iterable))

to

    iterator = iter(iterable)

and "next_value()" to "next(iterator)" gets rid of it. I would have tackled it by letting the inner iterator set a "finished" flag and I would have exhausted the iterable by iterating over it:

    def groupby(iterable):
        # Make sure this is one-pass
        iterator = iter(iterable)
        finished = False

        # Yields a group
        def yield_group():
            nonlocal finished, group_key
            # This was taken off the iterator
            # by the previous group
            yield group_key
            for item in iterator:
                if item != group_key:
                    # Set up next group
                    group_key = item
                    return
                yield item
            # No more items in iterator
            finished = True

        # This is always the head of the next
        # or current group
        group_key = next(iterator)
        while not finished:
            group = yield_group()
            yield group_key, group
            # Make sure the iterator is exhausted
            for _ in group:
                pass
            # group_key will now be the head of the next group

This does have a key difference. Whereas with groupby you have the confusing property that

    from itertools import groupby
    grps = groupby("|||---|||")
    a = next(grps)
    b = next(grps)
    c = next(grps)
    list(a[1])
    #>>> ['|', '|', '|']

with mine this does not happen:

    grps = groupby("|||---|||")
    a = next(grps)
    b = next(grps)
    c = next(grps)
    list(a[1])
    #>>> []

Was this an oversight in the original design or is this actually desired? I would guess it's an oversight.

I guess implementing groupby() would be a good interview question. :-) Can we get back to the issue at hand, which is whether and how we can change the behavior of StopIteration to be less error-prone? On Mon, Nov 3, 2014 at 8:10 PM, Joshua Landau <joshua@landau.ws> wrote:
-- --Guido van Rossum (python.org/~guido)

Firstly, sorry for my buggy code that began all this mess. But what about adding a named parameter to the next function, specifying an exception to be raised on StopIteration, allowing it to propagate? On Nov 4, 2014, at 06:49, Guido van Rossum <guido@python.org> wrote:

On 11/4/2014 6:12 PM, Yotam Vaknin wrote:
There is already a 'default' option, so the two would have to be mutually exclusive. "next(iterator[, default]) Retrieve the next item from the iterator by calling its __next__() method. If default is given, it is returned if the iterator is exhausted, otherwise StopIteration is raised." I believe either adding an alternate exception, or raising the default if it is an exception, has been discussed before on this list. I cannot remember details other than the obvious fact that nothing was changed. -- Terry Jan Reedy
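For reference, the existing default already absorbs exhaustion:

    it = iter([])
    print(next(it, 'done'))   # -> 'done'; the default absorbs StopIteration
    next(it)                  # without a default: raises StopIteration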

On 5/11/2014 1:55 p.m., Terry Reedy wrote:
If it were a keyword-only argument, there wouldn't be any conflict. But I'm not sure I like the idea of adding more complexities to the iterator protocol. Anything that changes the signature of __next__ or __iter__ would be a bad idea, since all existing iterables would need to be updated to take it into account. If this is is to be done, it might be better to add a new optional dunder method, so that existing iterables would continue to work. -- Greg

On 4 Nov 2014 10:31, "Guido van Rossum" <guido@python.org> wrote:
On Mon, Nov 3, 2014 at 1:38 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
I actually thought this was what you meant originally, and while it *would* require changes to contextlib, they're fairly minor: the current check for StopIteration would change to catch "UnexpectedStopIteration" instead, and the exception identity check would look at __cause__ rather than directly at the caught exception.
Aye, breaking the equivalence between "return" and "raise StopIteration" is pretty major. I'm not even sure a plausible transition plan is possible, as at least contextlib would trigger any warning we might issue. Regards, Nick.

On 6 Nov 2014 00:09, "Nick Coghlan" <ncoghlan@gmail.com> wrote:
I definitely see this as a serious backward incompatibility: no matter how often it leads to buggy or obfuscated code, it's how things have always worked.

Aye, breaking the equivalence between "return" and "raise StopIteration" is pretty major.

I'm not even sure a plausible transition plan is possible, as at least contextlib would trigger any warning we might issue.

And having said that... what if we introduced UnexpectedStopIteration but initially made it a subclass of StopIteration? We could issue a deprecation warning whenever we triggered the StopIteration -> UnexpectedStopIteration conversion, pointing out that at some point in the future (3.6? 3.7?), UnexpectedStopIteration will no longer be a subclass of StopIteration (perhaps becoming a subclass of RuntimeError instead?). contextlib could avoid the warning by preconstructing a suitable UnexpectedStopIteration instance and throwing *that* into the generator, rather than throwing in a StopIteration raised from the with statement body. Regards, Nick.
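A sketch of the transitional hierarchy Nick suggests (UnexpectedStopIteration is his placeholder name; this exact scheme was never part of the language):

    # 3.5 (transitional): still a StopIteration, so existing except clauses
    # and loops keep working, but the conversion point can emit a warning.
    class UnexpectedStopIteration(StopIteration):
        pass

    # Later (3.6/3.7): re-parented so it no longer silently stops loops.
    # class UnexpectedStopIteration(RuntimeError):
    #     pass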

On Wed, Nov 5, 2014 at 5:17 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
And having said that... what if we introduced UnexpectedStopIteration but initially made it a subclass of StopIteration?
Exactly! -- Juancarlo *Añez*

On 11/05/2014 03:47 PM, Nick Coghlan wrote:
I really don't like the name Unexpected anything. It makes me think of blue screens and page faults. :-/ I'm also not sure how it's supposed to work or where it should come from. As near as I can tell these two examples below are equivalent. I think the thing that needs to be avoided is the case of the endless loop. It would be better to let the Exception be noisy. How would the second example be changed in order to do that? Or is there some other thing that needs to be fixed? Cheers, Ron

    def izip(*args):
        iters = [iter(obj) for obj in args]
        while True:
            yield list(next(it) for it in iters)  # StopIteration suppressed
                                                  # by list comprehension,
                                                  # resulting in empty lists.
        # While loop never exits.
        print("never printed")

    a = izip([1,2],[3,4])
    print(next(a), next(a), next(a))
    # (1, 3) (2, 4) ()
    #list(izip([1,2],[3,4]))  # Currently never returns

    def izip2(*args):
        iters = [iter(obj) for obj in args]
        while True:
            L = []
            for it in iters:
                try:
                    obj = next(it)  # StopIteration suppressed here.
                except StopIteration:
                    break  # exit for-loop -- "return" instead of "break"?
                L.append(obj)
            yield L
        # While loop never exits.
        print("never printed")

    a = izip2([5,6],[7,8])
    print(next(a), next(a), next(a))
    # (5, 7) (6, 8) ()
    list(izip2([5,6],[7,8]))  # Currently never returns
    # Not only doesn't it exit, but it's building
    # an endless list of empty lists!

On Thu, Nov 06, 2014 at 07:47:09AM +1000, Nick Coghlan wrote:
I'm sorry, I have been trying to follow this thread, but there have been too many wrong turns and side-tracks for me to keep it straight. What is the problem this is supposed to solve? Is it just that list (and set and dict) comprehensions treat StopIteration differently than generator expressions? That is, that

    [expr for x in iterator]
    list(expr for x in iterator)

are not exactly equivalent, if expr raises StopIteration. If so, it seems to me that you're adding a lot of conceptual baggage and complication for very little benefit, and this will probably confuse people far more than the current situation does. The different treatment of StopIteration in generator expressions and list comprehensions does not seem to be a problem for people in practice, judging by the python-list and tutor mailing lists. The current situation is simple to learn and understand:

(1) Generator expressions *emit* StopIteration when they are done:

    py> next(iter([]))
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    StopIteration

(2) Functions such as tuple, list, set, dict *absorb* StopIteration:

    py> list(iter([]))
    []
    py> it = iter([])
    py> list(next(it) for y in range(1000))
    []

For-loops do the same, if StopIteration is raised in the "for x in iterable" header. That's how it knows the loop is done. The "for" part of a comprehension is the same.

(3) But raising StopIteration in the expression part (or if part) of a comprehension does not absorb the exception, it is treated like any other exception:

    py> [next(iter([])) for y in range(1000)]
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<stdin>", line 1, in <listcomp>
    StopIteration

If that is surprising to anyone, I suggest it is because they haven't considered what happens when you raise StopIteration in the body of a for-loop:

    py> for y in range(1000):
    ...     next(iter([]))
    ...
    Traceback (most recent call last):
      File "<stdin>", line 2, in <module>
    StopIteration

To me, the status quo is consistent, understandable, and predictable. In contrast, you have:

- a solution to something which I'm not sure is even a problem that needs solving;
- but if it does, the solution seems quite magical, complicated, and hard to understand;
- it is unclear (to me) under what circumstances StopIteration will be automatically converted to UnexpectedStopIteration;
- and it seems to me that it will lead to surprising behaviour when people deliberately raise StopIteration only to have it mysteriously turn into a different exception, but only sometimes.

It seems to me that if the difference between comprehensions and generator expressions really is a problem that needs solving, the best way to proceed is using the __future__ mechanism. 3.5 could introduce

    from __future__ import comprehensions_absorb_stopiteration

and then 3.6 or 3.7 could make it the default behaviour. We're still breaking backwards compatibility, but at least we're doing it cleanly, without magic (well, except the __future__ magic, but that's well-known and acceptable magic). There will be a transition period during which people can choose to keep the old behaviour or the new, and then we transition to the new behaviour. This automatic transformation of some StopIterations into something else seems like it will be worse than the problem it is trying to fix.
For what it is worth, I'm a strong -1 on changing the behaviour of comprehensions at all, but if we must change it in a backwards incompatible way, +1 on __future__ and -1 on changing the exceptions to a different exception. -- Steven

On 11/06/2014 04:15 AM, Steven D'Aprano wrote:
It's the "when they are done" part that's having the issue. In some cases, they are never completely done because the StopIteration Error is handled incorrectly. The reason this doesn't show up more is that most iterators only have one loop, which is exited and the generator ends causing a different StopIteration to be emitted from the one that ends the loop.
Right, sometimes this doesn't happen.
I think this part is working as it should.
This is not always working. When a StopIteration is raised from the next call in the example that started the thread, it's getting replaced with the equivalent of a break, so it's never exiting the generator completely. I think it needs to be replaced with the equivalent of return, which will end the generator, and cause a StopIteration to be emitted. (As you describe here.) It looks like it's a bug in the C code for generator expressions. Since it only affects generator expressions that have iterators with nested loops, it may be fixable. Cheers, Ron

On 11/6/2014 5:15 AM, Steven D'Aprano wrote:
'iter([])' is a list_iterator, not a generator expression. Here is the example I think you wanted.
Which is the Python translation of the comprehension.
I agree. -- Terry Jan Reedy

I believe the issue is that, in some situations, the generator is absorbing StopIteration when it shouldn't. -- ~Ethan~

TL;DR :-( The crux of the issue is that sometimes the StopIteration raised by an explicit next() call is "just" an exception, but sometimes it terminates some controlling loop. Most precedents clearly establish it as "just" an exception, e.g. (these are all nonsense):

    it = iter([1, 2, 3, -1])
    for x in it:
        if x < 0:
            next(it)

    it = iter([1, 2, 3])
    [x for x in it if next(it) < 0]

Both these raise StopIteration and report a traceback. But generators and generator expressions are different. In a generator, any StopIteration that isn't caught by an exception handler in the body of the generator implicitly terminates the iteration, just like "return" or falling off the end. This was an explicitly designed feature, but I don't think it has worked out very well, given that more often than not, a StopIteration that "escapes" from some expression is a bug. And because generator expressions are implemented using generators (duh), the following returns [] instead of raising StopIteration:

    it = iter([1, 2, 3])
    list(x for x in it if next(it) < 0)

This is confusing because it breaks the (intended) equivalence between list(<genexp>) and [<genexp>] (even though we refer to the latter as a comprehension, the syntax inside the [] is the same as a generator expression). If I had had the right foresight, I would have made it an error to terminate a generator with a StopIteration, probably by raising another exception chained to the StopIteration (so the traceback shows the place where the StopIteration escaped). The question at hand is if we can fix this post-hoc, using clever tricks and (of course) a deprecation period. --Guido On Thu, Nov 6, 2014 at 2:15 AM, Steven D'Aprano <steve@pearwood.info> wrote:
-- --Guido van Rossum (python.org/~guido)

On Thu, Nov 06, 2014 at 08:01:32PM -0430, Juancarlo Añez wrote:
What about it? There are 19 lines in the Zen, and I don't think any of them are particularly relevent here. Which line, or lines, were you thinking of? -- Steven

On Fri, Nov 7, 2014 at 5:13 AM, Steven D'Aprano <steve@pearwood.info> wrote:
You asked what was the point in fixing the current behavior regarding StopIteration:

- Beautiful is better than ugly.
- Explicit is better than implicit.
- Simple is better than complex.
- Flat is better than nested.
- Readability counts.
- Special cases aren't special enough to break the rules.
- Errors should never pass silently.
- In the face of ambiguity, refuse the temptation to guess.
- There should be one-- and preferably only one --obvious way to do it.
- Now is better than never.
- If the implementation is hard to explain, it's a bad idea.

Basically, some of the behaviours Guido mentioned are unexpected consequences of a lack of foresight in the implementation of generator expressions. It is not in the Zen of Python to leave it as is just because some existing code may be relying on the odd behavior. Just to be clear: that StopIteration will cancel more than one iterator is an unintended behavior that is difficult to explain, is of questionable usefulness, and is the source of difficult-to-catch bugs. Cheers, -- Juancarlo *Añez*

On Fri, Nov 7, 2014 at 11:57 PM, Juancarlo Añez <apalala@gmail.com> wrote:
It is not in the Zen of Python to leave it as is just because some existing code may be relying on the odd behavior.
Actually, it is. Practicality beats purity. :) ChrisA

On 7 November 2014 07:45, Antoine Pitrou <solipsis@pitrou.net> wrote:
It's not about people relying on the current behaviour (it can't be, since we're talking about *changing* that behaviour), it's about "Errors should never pass silently". That is, the problematic cases that (at least arguably) may be worth fixing are those where:

1. StopIteration escapes from an expression (Error!)
2. Instead of causing a traceback, it terminates a containing generator (Passing silently!)

As asyncio coroutines become more popular, I predict some serious head scratching from StopIteration escaping an asynchronous operation and getting thrown into a coroutine, which then terminates with a "return None" rather than propagating the exception as you might otherwise expect. The problem with this particular style of bug is that the only trace it leaves is a generator iterator that terminates earlier than expected - there's no traceback, log message, or any other indication of where something strange may be happening. Consider the following, from the original post in the thread:

    def izip(*args):
        iters = [iter(obj) for obj in args]
        while True:
            yield tuple([next(it) for it in iters])

The current behaviour of that construct is that, as soon as one of the iterators is empty:

1. next(it) throws StopIteration
2. the list comprehension unwinds the frame, and allows the exception to propagate
3. the generator iterator unwinds the frame, and allows the exception to propagate
4. the code invoking the iterator sees StopIteration and assumes iteration is complete

If you switch to the generator expression version instead, the flow control becomes:

1. next(it) throws StopIteration
2. the generator expression unwinds the frame, and allows the exception to propagate
3. the iteration inside the tuple constructor sees StopIteration and halts
4. the generator iterator never terminates

In that code, "next(it)" is a flow control operation akin to break (it terminates the nearest enclosing generator iterator, just as break terminates the nearest enclosing loop), but it's incredibly unclear that this is the case - there's no local indication that it may raise StopIteration, you need to "just know" that raising StopIteration is a possibility. Guido's suggestion is to consider looking for a viable way to break the equivalence between "return" and "raise StopIteration" in generator iterators - that way, the only way for the above code to work would be through a more explicit version that clearly tracks the flow control.
Option 1 would be to assume we use a new exception, and are OK with folks catching it explicitly:

    from __future__ import explicit_generator_return

    def izip(*args):
        iters = [iter(obj) for obj in args]
        while True:
            try:
                t = tuple(next(it) for it in iters)
            except UncaughtStopIteration:
                return  # One of the iterators has been exhausted
            yield t

Option 2 would be to assume the new exception is something generic like RuntimeError, requiring the inner loop to be converted to statement form:

    def izip(*args):
        iters = [iter(obj) for obj in args]
        while True:
            entry = []
            for it in iters:
                try:
                    item = next(it)
                except StopIteration:
                    return  # One of the iterators has been exhausted
                entry.append(item)
            yield tuple(entry)

With option 2, you can also still rely on the fact that list comprehensions don't create a generator frame:

    def izip(*args):
        iters = [iter(obj) for obj in args]
        while True:
            try:
                entry = [next(it) for it in iters]
            except StopIteration:
                return  # One of the iterators has been exhausted
            yield tuple(entry)

The upside of the option 2 spellings is they'll work on all currently supported versions of Python, while the downside is the extra object construction they have to do if you want to yield something other than a list. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
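Assuming the option 2 izip above, a quick sanity check that exhaustion now terminates the generator instead of looping:

    print(list(izip([1, 2], [3, 4])))   # -> [(1, 3), (2, 4)], and it returns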

On 11/11/2014 06:21 AM, Nick Coghlan wrote:
When I was experimenting with this earlier, I needed the try-except to catch the StopIteration exception in order to do a "break", which gave me the current behaviour of the generator expression being discussed. Replacing break with return, as above, gave the expected behaviour, but also just removing the try-except and letting the StopIteration propagate out worked as well. That is, StopIteration(None) is equivalent to "return None" in the context above. Can you point me to the source file that implements generator expression byte code or C code? I wanted to look at that to see what was actually going on, but it looks like it may be a combination of a regular generator with a condition at some point to handle it slightly differently. Cheers, Ron

On 12 November 2014 02:38, Ron Adam <ron3200@gmail.com> wrote:
When a StopIteration escapes from a generator frame, it's currently just propagated without modification. When the generator frame *returns*, the generator object converts that to raising StopIteration: https://hg.python.org/cpython/file/30a6c74ad87f/Objects/genobject.c#l117 An implementation of Guido's __future__ import idea would likely involve setting a flag on generator iterators when they're instantiated to say whether or not to intercept and convert StopIteration instances that escape from the generator frame. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

Sorry for my radio silence in this thread -- it was a combination of a conference and getting sick afterwards. IIUC Nick prefers reusing an existing exception over inventing a new one (to mean that a StopIteration was about to escape from a generator frame). The motivation is that this encourages catching the StopIteration at the source rather than relying on the new exception, thereby encouraging code that also works with previous versions. I like this. Nobody has suggested anything other than RuntimeError so let's go with that. I also like the idea of a __future__ import to request the new behavior in Python 3.5, and a warning if the error condition is detected while the new behavior isn't requested. I assume the __future__ import applies to generator function definitions, not calls. (But someone should reason this through before we commit.) Do we need a PEP for this or can we just go ahead? I haven't heard any pushback on the basic premise that this is a problem we'd like to fix. PS. If we decide not to go ahead with this, there's a small change to the semantics of "return" in a generator that might allow asyncio to distinguish between an intended return statement in the generator and an accidentally escaping StopIteration -- the return case should use a newly defined subclass of StopIteration. asyncio's _step() function can then tell the two situations apart easily. On Tue, Nov 11, 2014 at 4:21 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
-- --Guido van Rossum (python.org/~guido)
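A sketch of what this eventually looks like with the __future__ import; the name generator_stop is the one later adopted by PEP 479, assumed here for concreteness:

    from __future__ import generator_stop

    def gen():
        yield next(iter([]))   # StopIteration raised inside the frame

    # next(gen()) now raises RuntimeError, chained to the original
    # StopIteration, instead of silently ending the generator:
    next(gen())   # RuntimeError: generator raised StopIteration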

On 11/13/2014 12:04 PM, Guido van Rossum wrote:
Do we need a PEP for this or can we just go ahead? I haven't heard any pushback on the basic premise that this is a problem we'd like to fix.
Given the confusion about what the problem is, and the possible fixes, I think a short PEP would be in order just so everyone is on the same page. I don't fully understand everything discussed so far (not a big user of generators), so I'll only throw my hat in as a backup volunteer to write the PEP if no one one else is able to take the time. -- ~Ethan~

On 14 November 2014 08:20, Ethan Furman <ethan@stoneleaf.us> wrote:
Agreed, I think it's worth having an explanatory PEP at least for documentation purposes. It also makes it easier to reference from the 3.5 What's New, as there may be some code that's relying on the current behaviour that may need adjusting to use a custom exception type rather than StopIteration (or else refactoring to avoid crossing a generator frame boundary). Regards, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

Someone please volunteer to write this PEP. I can review, but I need to save my Python time to work on the type hinting PEP. On Thu, Nov 13, 2014 at 11:41 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
-- --Guido van Rossum (python.org/~guido)

On Sat, Nov 15, 2014 at 4:30 AM, Guido van Rossum <guido@python.org> wrote:
Someone please volunteer to write this PEP. I can review, but I need to save my Python time to work on the type hinting PEP.
After the stunning success :) of my last such endeavour (exception-catching expressions), I guess I could put my hand up for this one, if there's no better volunteer forthcoming. ChrisA

On Fri, Nov 14, 2014 at 6:41 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
Agreed, I think it's worth having an explanatory PEP at least for documentation purposes.
Draft has been sent to the PEP editors, but if anyone wants to preview it, it's currently here: https://raw.githubusercontent.com/Rosuav/GenStopIter/master/pep-xxx.txt ChrisA

On Thu, Nov 13, 2014 at 12:04:52PM -0800, Guido van Rossum wrote:
Do we need a PEP for this or can we just go ahead? I haven't heard any pushback on the basic premise that this is a problem we'd like to fix.
*puts hand up* I'm not convinced that this is *a problem to be fixed*. At worst it is a "gotcha" to be aware of. The current behaviour is simple to understand: raising StopIteration halts the generator, end of story. I'm still not sure that I understand what the proposed fix is (a PEP will be good to explain that), but if I have understood it correctly, it turns a simple concept like "StopIteration halts the generator" into something more complicated: some StopIterations will halt the generator, others will be chained to a new, unrelated exception. Judging by the complete lack of questions about this on the tutor and python-list mailing lists, the slight difference in behaviour between generator expressions and comprehensions is not an issue in practice. I've seen people ask about the leakage of variables from comprehensions; I've never seen people ask about the different treatment of StopIteration. I have, however, seen people *rely* on that different treatment. E.g. to implement a short-circuiting generator expression that exits when a condition is reached: (expr for x in sequence if cond or stop()) where stop() raises StopIteration and halts the generator. If this change goes ahead, it will break code that does this. Having generator expressions and comprehensions behave exactly the same leads to the question, why do we have comprehensions? I guess the answer is "historical reasons", but from a pragmatic point of view being able to choose between [expr for x in sequence] list(expr for x in sequence) depending on how you want StopIteration to be treated may be useful. -- Steven
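Spelled out, the short-circuit idiom Steven describes (behaviour under the 3.4 semantics; stop() is user code):

    def stop():
        raise StopIteration

    values = [1, 2, 3, 4]
    # The generator expression absorbs the StopIteration and stops early:
    print(list(x for x in values if x < 3 or stop()))   # -> [1, 2]
    # The comprehension form does not absorb it:
    # [x for x in values if x < 3 or stop()]   # raises StopIteration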

On Sat, Nov 15, 2014 at 5:48 PM, Steven D'Aprano <steve@pearwood.info> wrote:
I'm still not sure that I understand what the proposed fix is (a PEP will be good to explain that)
Draft PEP exists, and now has your concerns incorporated. https://raw.githubusercontent.com/Rosuav/GenStopIter/master/pep-xxx.txt ChrisA

On Thu, Nov 06, 2014 at 10:54:51AM -0800, Guido van Rossum wrote:
TL;DR :-(
That's how I feel about this whole thread ;-) [...]
Do we need "clever tricks"? In my earlier email, I suggested that if this needs to be fixed, the best way to introduce a change in behaviour is with the __future__ mechanism. 3.5 could introduce

    from __future__ import stopiteration_is_an_error

(in my earlier post, I managed to get the suggested behaviour completely backwards) and then 3.6 could raise a warning and 3.7 could make it the default behaviour. We're still breaking backwards compatibility, but at least we're doing it cleanly, without clever and/or ugly hacks. There will be a transition period during which people can choose to keep the old behaviour or the new, and then we transition to the new behaviour. -- Steven

Trying to keep the thread on focus... On Fri, Nov 7, 2014 at 3:20 AM, Steven D'Aprano <steve@pearwood.info> wrote:
Sure. (Though IMO __future__ itself is a clever hack.)
We'd still need to specify the eventual behavior. I propose the following as a strawman (after the deprecation period is finished and the __future__ import is no longer needed): - If a StopIteration is about to bubble out of a generator frame, it is replaced with some other exception (maybe RuntimeError, maybe a new custom Exception subclass, but *not* deriving from StopIteration) which causes the next() call (which invoked the generator) to fail, passing that exception out. From then on it's just like any old exception. During the transition, we check if the generator was defined in the scope of the __future__, and if so, we do the same thing; otherwise, we issue a warning and let the StopIteration bubble out, eventually terminating some loop or generator expression. It would be nice if, when the warning is made fatal (e.g. through the -W flag), the exception raised was the same one mentioned above (i.e. RuntimeError or a custom subclass -- I don't care much about this detail). -- --Guido van Rossum (python.org/~guido)
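Guido's strawman, rendered as rough Python-level pseudocode (resume_frame and frame_returned are invented stand-ins for interpreter internals, so this is illustrative only, not runnable as-is):

    def gen_next(gen):
        """What next(gen) would do under the strawman (pseudocode)."""
        try:
            return resume_frame(gen)        # hypothetical: run the frame to its next yield
        except StopIteration as exc:
            if frame_returned(gen):         # hypothetical: frame ended via return
                raise                       # normal exhaustion signal
            # StopIteration escaped from *inside* the frame: likely a bug
            raise RuntimeError("generator raised StopIteration") from exc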

On 11/7/2014 12:10 PM, Guido van Rossum wrote:
Double meanings are a constant problem. The ambiguity of StopIteration (unintended bug indicator? intended stop signal?) is not needed within generator functions*. An explicit 'raise StopIteration' is more easily written 'return'. The rare, clever 'yield expression' that means 'either yield the value of expression or raise StopIteration' can be re-written

    try:
        tem = expression
    except StopIteration:
        return
    yield tem

An anonymous generator expression with a 'yield or raise' expression can be re-written as a named generator function. In this context, allowing 'yield or raise' is too clever in that it breaks the desired (intended?) equivalence between 'list(genexp)' and '[genexp]'. I agree that this is a bad tradeoff. Such alternatives could be documented. * StopIteration raised within a __next__ method could be a bug, but a) this should be rare, b) the possibility cannot be eliminated, and c) it would be too expensive to not take StopIteration at face value.
- If a StopIteration is about to bubble out of a generator frame, it is replaced with some other exception (maybe RuntimeError,
I support this particular replacement. To me: 1. The proposal amounts to defining StopIteration escaping a running generator frame as a runtime error.
maybe a new custom Exception subclass, but *not* deriving from StopIteration)
2. There are already more Exception classes than I can remember. 3. This error should be very rare after the transition. 4. A good error message will be at least as important as the class name. 5. A new, specific exception class is an invitation for people to write code that raises StopIteration so that its replacement can be caught in an except clause in the calling code. If the existing builtin exceptions other than StopIteration are not enough to choose from, one can define a custom class.
Any replacement will be ordinary as soon as it is made, before it reaches next(), so I would move this last line up a couple of lines. -- Terry Jan Reedy

On Nov 6, 2014, at 10:54, Guido van Rossum <guido@python.org> wrote:
This is confusing because it breaks the (intended) equivalence between list(<genexp>) and [<genexp>] (even though we refer to the latter as a comprehension, the syntax inside the [] is the same as a generator expression).
If this change (I mean the proposed clever workaround, not the "no terminating generators with StopIteration" change that's too radical) would be sufficient to make that equivalence actually true instead of just pretty close, I think that's reason enough to fix it on its own. Especially since that would make it easy to fix the genexpr docs. (Read 6.2.8 and tell me what it says the semantics of a genexpr are, and what values it yields. Now try to think of a way to fix that without repeating most of the text from 6.2.4, which nobody wants to do. If the docs could just define the semantics of genexprs, then define listcomps by saying that [<genexpr>] is equivalent to list(<genexpr>), that would be a lot simpler and clearer.)
participants (17)
- Akira Li
- Andrew Barnert
- Antoine Pitrou
- Chris Angelico
- Ethan Furman
- Greg
- Greg Ewing
- Guido van Rossum
- Joshua Landau
- Juancarlo Añez
- Nick Coghlan
- Ron Adam
- Steven D'Aprano
- Terry Reedy
- Tim Delaney
- yotam vaknin
- Yotam Vaknin