[Python-Dev] Termination of two-arg iter()

Tim Peters tim.one@comcast.net
Sat, 13 Jul 2002 15:50:18 -0400


[Tim]
>> The base question:  does the iteration protocol define what
>> happens if an iterator's next() method is called after the iterator
>> has raised StopIteration?  Or is that left up to the discretion of the
>> iterator?

[Guido]
> The latter.  Believe it or not, I thought about this during the design
> of the protocol, and decided that if someone wanted to create an
> iterator that could somehow continue after raising StopIteration, that
> should be their problem.

I believe it -- I even vaguely recall discussions about it.  Unfortunately,
they don't seem to be recorded anywhere I can find now.

> Basically, the effect of calling next() after StopIteration is raised is
> undefined.

That's consistent with a lawyer's reading of the relevant PEP <wink>.

>> If the answer is that it's the iterator's choice, is 2-argument iter()
>> making the best choice?  The rub here is that 2-arg iter was (IMO)
>> introduced to help iteration-ignorant callables fit into the iteration
>> protocol, and *because* they're iteration-ignorant they may do something
>> foolish if called again after their "sentinel" value is seen.

> If the caller stops calling next(), nothing's wrong.  I don't think
> the callable-iterator object should grow another state bit.
>
> But I'm willing to be convinced by information you withheld.

I entered the c.l.py report as a bug against re.  The user provoked re into
hanging via using re's new finditer() method.  The connection to 2-arg iter
is buried in re's C implementation, via PyCallIter_New.  I didn't think it
added anything useful here to spell all that out.

It turns out (and unsurprisingly so with hindsight) that re can be provoked
into the same bad behavior without involving the iteration protocol at all,
so in this case I think finditer() just made it easier to expose a flaw that
was present regardless.  I'm happy to leave this be:  the docs match the
implemenation, I'm sure *someone* relies on that by now, and the behavior is
easy to explain as-is.