[Python-Dev] Please reconsider PEP 479.

Mark Shannon mark at hotpy.org
Mon Nov 24 01:25:04 CET 2014



On 23/11/14 22:54, Chris Angelico wrote:
> On Mon, Nov 24, 2014 at 7:18 AM, Mark Shannon <mark at hotpy.org> wrote:
>> Hi,
>>
>> I have serious concerns about this PEP, and would ask you to reconsider it.
>
> Hoping I'm not out of line in responding here, as PEP author. Some of
> your concerns (eg "5 days is too short") are clearly for Guido, not
> me, but perhaps I can respond to the rest of it.
>
>> [ Very short summary:
>>      Generators are not the problem. It is the naive use of next() in an
>> iterator that is the problem. (Note that all the examples involve calls to
>> next()).
>>      Change next() rather than fiddling with generators.
>> ]
>>
>> StopIteration is not a normal exception, indicating a problem, rather it
>> exists to signal exhaustion of an iterator.
>> However, next() raises StopIteration for an exhausted iterator, which really
>> is an error.
>> Any iterator code (generator or __next__ method) that calls next() treats
>> the StopIteration as a normal exception and propagates it.
>> The controlling loop then interprets StopIteration as a signal to stop and
>> thus stops.
>> *The problem is the implicit shift from signal to error and back to signal.*
>
> The situation is this: Both __next__ and next() need the capability to
> return literally any object at all. (I raised a hypothetical
> possibility of some sort of sentinel object, but for such a sentinel
> to be useful, it will need to have a name, which means that *by
> definition* that object would have to come up when iterating over the
> .values() of some namespace.) They both also need to be able to
> indicate a lack of return value. This means that either they return a
> (success, value) tuple, or they have some other means of signalling
> exhaustion.

You are grouping next() and it.__next__() together, but they are different.
I think we agree that the __next__() method is part of the iterator 
protocol and should raise StopIteration.
There is no fundamental reason why next(), the builtin function, should 
raise StopIteration, just because  __next__(), the method, does.
Many xxx() functions that wrap __xxx__() methods add additional 
functionality.

Consider max() or min(). Both of these functions take an iterable and,
if that iterable is empty, they raise a ValueError.
If next() did likewise then the original example that motivates this PEP
would not be a problem.
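
As a rough sketch of that idea (not a proposal for the actual builtin),
a wrapper can treat exhaustion the way max() and min() treat an empty
iterable. The names strict_next, _MISSING and first_items are
hypothetical and exist only for this illustration:

```python
_MISSING = object()  # private marker, so that None stays usable as a default


def strict_next(iterator, default=_MISSING):
    """Hypothetical next() variant: exhaustion becomes a ValueError."""
    try:
        return next(iterator)
    except StopIteration:
        if default is not _MISSING:
            return default
        # "from None" hides the StopIteration, so only a real error escapes.
        raise ValueError("strict_next() called on an exhausted iterator") from None


def first_items(iterables):
    """Generator yielding the first item of each iterable."""
    for iterable in iterables:
        # An empty iterable now produces a ValueError instead of a
        # StopIteration that could be mistaken for the end of iteration.
        yield strict_next(iter(iterable))
```

With this wrapper, list(first_items([[1], [], [2]])) fails with a
ValueError rather than stopping early.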

>
> I'm not sure what you mean by your "However" above. In both __next__
> and next(), this is a signal; it becomes an error as soon as you call
> next() and don't cope adequately with the signal, just as KeyError is
> an error.
>
>> 2. The proposed solution does not address this issue at all, but rather
>> legislates against generators raising StopIteration.
>
> Because that's the place where a StopIteration will cause a silent
> behavioral change, instead of cheerily bubbling up to top-level and
> printing a traceback.
I must disagree. It is the FOR_ITER bytecode (implementing a loop or 
comprehension) that "silently" converts a StopIteration exception into a 
branch.

I think the generator's __next__() method handling of exceptions is 
correct; it propagates them, like most other code.
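
A minimal sketch of that conversion, using a class-based iterator so
that it behaves the same with or without the PEP. The class name
FirstItems is made up for the example:

```python
class FirstItems:
    """Iterator over the first item of each sequence (deliberately naive)."""

    def __init__(self, sequences):
        self._sequences = iter(sequences)

    def __iter__(self):
        return self

    def __next__(self):
        # StopIteration from this call is the intended end-of-iteration signal.
        sequence = next(self._sequences)
        # StopIteration from this call is a bug: it only means that
        # `sequence` was empty, yet it propagates like any other exception.
        return next(iter(sequence))


result = []
for item in FirstItems([[1, 2], [], [3, 4]]):  # the loop turns the exception into a branch
    result.append(item)

print(result)  # prints [1]: the empty list ended the loop early
```

The __next__() method does nothing unusual here; it is the loop that
interprets the escaped exception as exhaustion.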

>
>> 3. Generators and the iterator protocol were introduced in Python 2.2, 13
>> years ago.
>> For all of that time the iterator protocol has been defined by the
>> __iter__(), next()/__next__() methods and the use of StopIteration to
>> terminate iteration.
>>
>> Generators are a way to write iterators without the clunkiness of explicit
>> __iter__() and next()/__next__() methods, but have always obeyed the same
>> protocol as all other iterators. This has allowed code to be rewritten
>> from one form to the other whenever desired.
>>
>> Do not forget that despite the addition of the send() and throw() methods
>> and their secondary role as coroutines, generators have primarily always
>> been a clean and elegant way of writing iterators.
>
> This question has been raised several times; there is a distinct
> difference between __iter__() and __next__(), and it is only the
I only mentioned __iter__ because it is part of the protocol; I agree that 
__next__ is the relevant method.
> latter which is aware of StopIteration. Compare these three classes:
>
> class X:
>      def __init__(self): self.state=0
>      def __iter__(self): return self
>      def __next__(self):
>          if self.state == 3: raise StopIteration
>          self.state += 1
>          return self.state
>
> class Y:
>      def __iter__(self):
>          return iter([1,2,3])
>
> class Z:
>      def __iter__(self):
>          yield 1
>          yield 2
>          yield 3
>
> Note how just one of these classes uses StopIteration, and yet all
> three are iterable, yielding the same results. Neither Y nor Z is
> breaking iterator protocol - but neither of them is writing an
> iterator, either.

All three raise StopIteration, even if it is implicit.
This is trivial to demonstrate:

def will_it_raise_stop_iteration(it):
     try:
         while True:
             it.__next__()
     except StopIteration:
         print("Raises StopIteration")
     except Exception:
         print("Raises something else")
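
For anyone who wants to run the check, here is a self-contained sketch
applying the same loop to the three classes quoted above. The helper
which_exception_ends_it is a hypothetical variant of the function above
that returns the message instead of printing it:

```python
def which_exception_ends_it(it):
    # Same loop as will_it_raise_stop_iteration(), but returning the message.
    try:
        while True:
            it.__next__()
    except StopIteration:
        return "Raises StopIteration"
    except Exception:
        return "Raises something else"


class X:
    def __init__(self): self.state = 0
    def __iter__(self): return self
    def __next__(self):
        if self.state == 3: raise StopIteration
        self.state += 1
        return self.state


class Y:
    def __iter__(self):
        return iter([1, 2, 3])


class Z:
    def __iter__(self):
        yield 1
        yield 2
        yield 3


# iter() is needed because Y and Z are iterables, not iterators.
results = {cls.__name__: which_exception_ends_it(iter(cls())) for cls in (X, Y, Z)}
print(results)
```

All three entries come out as "Raises StopIteration".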

>
>> 4. Porting from Python 2 to Python 3 seems to be hard enough already.
>
> Most of the code broken by this change can be fixed by a mechanical
> replacement of "raise StopIteration" with "return"; the rest need to
> be checked to see if they're buggy or unclear. There is an edge case
> with "return some_value" vs "raise StopIteration(some_value)" (the
> former's not compatible with 2.7), but apart from that, the
> recommended form of code for 3.7 will work in all versions of Python
> since 2.2.
I think that when it comes to porting 2 to 3, the perception is more 
important than the technical difficulty. Sadly :(
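
For reference, a sketch of the mechanical replacement and of the edge
case mentioned above. The generator names are invented for the example,
and the "return total" form assumes Python 3.3 or later:

```python
def take_until_zero(values):
    """Yield values until the first zero is seen."""
    for value in values:
        if value == 0:
            return  # older code would have written: raise StopIteration
        yield value


def summing(values):
    """Yield each value, then hand back the total as the return value."""
    total = 0
    for value in values:
        total += value
        yield value
    # Edge case: this form is not valid in 2.7. The value travels on the
    # StopIteration instance that ends the generator.
    return total


gen = summing([1, 2, 3])
try:
    while True:
        next(gen)
except StopIteration as stop:
    total = stop.value  # the generator's return value

print(list(take_until_zero([3, 2, 0, 5])), total)
```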

>
>> 5. I think I've already covered this in the other points, but to reiterate
>> (excuse the pun):
>> Calling next() on an exhausted iterator is, I would suggest, a logical
>> error.
>
> How do you know that it's exhausted, other than by calling next() on it?
Either we add a new method, or you have to handle the exception 
explicitly. But that is what you are trying to force anyway.

I probably should have said "Calling next(), without guarding against 
the possibility that the iterator is exhausted, is a logical error."
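
Two ways of guarding a call with today's next(), as a sketch. The
function names are hypothetical:

```python
_MISSING = object()  # marker that cannot collide with a real item


def first_or_default(iterable, default=None):
    # The two-argument form of next() returns the default instead of
    # raising StopIteration when the iterator is exhausted.
    item = next(iter(iterable), _MISSING)
    return default if item is _MISSING else item


def pairs(iterable):
    """Yield consecutive items two at a time."""
    it = iter(iterable)
    for first in it:
        try:
            second = next(it)
        except StopIteration:
            # Odd number of items: the decision to drop the unpaired one
            # is written down here rather than happening by accident.
            return
        yield first, second
```

In both cases the exhausted case is handled at the call site, so no
StopIteration can leak into an enclosing iterator.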

>
>> It is also worth noting that calling next() is the only place a
>> StopIteration exception is likely to occur outside of the iterator protocol.
>
> This I agree with.
>
>> An example
>> ----------
>>
>> Consider a function to return the value from a set with a single member.
>> def value_from_singleton(s):
>>      if len(s) < 2:  #Intentional error here (should be len(s) == 1)
>>         return next(iter(s))
>>      raise ValueError("Not a singleton")
>>
>> Now suppose we pass an empty set to value_from_singleton(s), then we get a
>> StopIteration exception, which is a bit weird, but not too bad.
>
> Only a little weird - and no different from the way you'd get a
> TypeError if you pass it an integer.
Except that TypeError is what it says, an error. StopIteration is a 
special not-really-an-error thing.

>
>> However it is when we use it in a generator (or in the __next__ method of an
>> iterator) that we get a serious problem.
>> Currently the iterator appears to be exhausted early, which is wrong.
>> However, with the proposed change we get RuntimeError("generator raised
>> StopIteration") raised, which is also wrong, just in a different way.
>
> What you have here is two distinct issues. The first is "what happens
> if an unexpected StopIteration occurs during __next__ processing?",
> and the second is "ditto ditto a generator's execution?". The first
> one is extremely hard to deal with, and extremely unlikely. The second
> is much easier to deal with, and can therefore be solved.
I don't think there are two distinct issues. It is only the combination 
of the two that causes a real problem.

There are two places where StopIteration could be converted into a "real" 
exception: in the next() function or in the generator.__next__() method.
Doing so in next() is, IMO, simpler and easier to understand and explain.
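
To make the generator-side conversion concrete, here is a sketch using
the singleton example quoted above. It assumes an interpreter where the
PEP's behaviour is the default (Python 3.7 or later); the generator name
singleton_values is made up:

```python
def value_from_singleton(s):
    if len(s) < 2:  # intentional error, as in the example (should be len(s) == 1)
        return next(iter(s))
    raise ValueError("Not a singleton")


def singleton_values(sets):
    """Generator yielding the single member of each set."""
    for s in sets:
        yield value_from_singleton(s)


try:
    outcome = list(singleton_values([{1}, set(), {2}]))
except RuntimeError as exc:
    # With the PEP in effect, the StopIteration that escapes from
    # value_from_singleton() is replaced at the generator boundary.
    outcome = exc

print(repr(outcome))
```

Without the PEP, the same call would return [1] and the remaining sets
would be silently skipped.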

>
>> Solutions
>> ---------
>> My preferred "solution" is to do nothing except improving the documentation
>> of next(). Explain that it can raise StopIteration which, if allowed to
>> propagate, can cause premature exhaustion of an iterator.
>
> Docs fixing doesn't solve everything.
True, but docs fixing is always backwards compatible :)
>
>> If something must be done then I would suggest changing the behaviour of
>> next() for an exhausted iterator.
>> Rather than raise StopIteration it should raise ValueError (or IndexError?).
>
> So, if I've understood you correctly, what you're saying is that
> __next__ should raise StopIteration, and then next() should absorb
> that and raise ValueError instead? I'm not sure how this would help
> anything, but I can see that it would poke the issue with a sharp
> pointy stick. Can you elaborate on how this would work in practice?
How would it help? It would prevent a propagating StopIteration from
causing premature exhaustion of an iterator. That is what the PEP is about, 
isn't it?
>
>> Also, it might be worth considering making StopIteration inherit from
>> BaseException, rather than Exception.
>
> Separate concern altogether, as the bases of StopIteration have
> nothing to do with a protocol meaning collision. I would probably
> support this change, on the basis that Exception should be for, well,
> exceptions, and BaseException can be used for everything that uses the
> exception-handling mechanism for other purposes. But it wouldn't help
> or affect this proposal.
Agreed.

Cheers,
Mark.

