[Python-Dev] PEP 479: Change StopIteration handling inside generators

Sun Nov 23 02:30:57 CET 2014

On Sun, Nov 23, 2014 at 12:11 PM, Raymond Hettinger
<raymond.hettinger at gmail.com> wrote:
> The worry is that your proposal intentionally breaks that code which is
> currently bug free, clean, fast, stable, and relying on a part of the API
> that has been guaranteed and documented from day one.

(I'd just like to mention that this isn't "my proposal", beyond that
I'm doing the summarizing and PEP writing. The proposal itself is
primarily derived from one of Guido's posts on -ideas.)

> Here's one from Fredrik Lundh's ElementTree code in the standard library
> (there are several other examples besides this one in his code as well):
>
>     def iterfind(elem, path, namespaces=None):
>         # compile selector pattern
>         cache_key = (path, None if namespaces is None
>                                 else tuple(sorted(namespaces.items())))
>         if path[-1:] == "/":
>             path = path + "*" # implicit all (FIXME: keep this?)
>         try:
>             selector = _cache[cache_key]
>         except KeyError:
>             if len(_cache) > 100:
>                 _cache.clear()
>             if path[:1] == "/":
>                 raise SyntaxError("cannot use absolute path on element")
>             next = iter(xpath_tokenizer(path, namespaces)).__next__
>             token = next()
>             selector = []
>             while 1:
>                 try:
>                     selector.append(ops[token[0]](next, token))
>                 except StopIteration:
>                     raise SyntaxError("invalid path")
>                 try:
>                     token = next()
>                     if token[0] == "/":
>                         token = next()
>                 except StopIteration:
>                     break
>             _cache[cache_key] = selector
>         # execute selector pattern
>         result = [elem]
>         context = _SelectorContext(elem)
>         for select in selector:
>             result = select(context, result)
>         return result

Most of the next() calls are already guarded with try/except; from
what I can see, only the first one isn't. So this wouldn't be much of
a change here.

> And here is an example from the pure python version of one of the itertools:
>
>     def accumulate(iterable, func=operator.add):
>         'Return running totals'
>         # accumulate([1,2,3,4,5]) --> 1 3 6 10 15
>         # accumulate([1,2,3,4,5], operator.mul) --> 1 2 6 24 120
>         it = iter(iterable)
>         total = next(it)
>         yield total
>         for element in it:
>             total = func(total, element)
>             yield total

Again, only the first next() call needs protection, and guarding it
just makes explicit a control flow possibility that is already there:
the first "yield total" might never be reached. Currently, *any*
function call inside a generator has the potential to be a silent
control flow event.
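
A minimal sketch of what that protection could look like for the
accumulate() example quoted above, assuming the fix is simply to catch
StopIteration around the first next() and return. The name
accumulate_guarded is made up for illustration; it is not what
itertools or its documentation uses.

```python
import operator

def accumulate_guarded(iterable, func=operator.add):
    'Return running totals'
    # accumulate_guarded([1,2,3,4,5]) --> 1 3 6 10 15
    it = iter(iterable)
    try:
        total = next(it)
    except StopIteration:
        # Empty input: end the generator explicitly instead of letting
        # StopIteration leak out of the generator body.
        return
    yield total
    for element in it:
        total = func(total, element)
        yield total
```

The behaviour for non-empty input is unchanged; the only difference is
that the "empty iterable" exit is now visible in the source.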

> And here is an example from Django:
>
>     def _generator():
>         it = iter(text.split(' '))
>         word = next(it)
>         yield word
>         pos = len(word) - word.rfind('\n') - 1
>         for word in it:
>             if "\n" in word:
>                 lines = word.split('\n')
>             else:
>                 lines = (word,)
>             pos += len(lines[0]) + 1
>             if pos > width:
>                 yield '\n'
>                 pos = len(lines[-1])
>             else:
>                 yield ' '
>                 if len(lines) > 1:
>                     pos = len(lines[-1])
>             yield word
>     return ''.join(_generator())

When you split a string, you're guaranteed at least one result, ergo
'it' is guaranteed to yield at least one word. So this one wouldn't
need to be changed - it can't possibly raise RuntimeError.
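
A quick check of that claim, as a sketch: with an explicit separator,
str.split() returns at least one element even for an empty string, so
the first next() always succeeds. (split() with no arguments is
different and can return an empty list, but that isn't what the quoted
code calls.) first_word is a hypothetical helper for illustration, not
part of Django.

```python
def first_word(text):
    # Mirrors the first two lines of the quoted _generator():
    # split on an explicit ' ' separator, then take the first item.
    it = iter(text.split(' '))
    return next(it)

# Even the empty string splits into one (empty) word.
empty_result = ''.split(' ')      # [''] rather than []
no_separator = ''.split()         # [] when no separator is given
```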

> I could scan for even more examples, but I think you get the gist.
> All I'm asking is that you consider that your proposal will do more
> harm than good.  It doesn't add any new capability at all.
> It just kills some code that currently works.

I have considered it, and I'm not convinced that it will. I see lots
of people saying "code will have to be changed", but that's exactly
the same concern that people raise about switching from the sloppy Py2
merging of text and bytes to the strict Py3 separation. Yes, code has
to be changed, but it's definitely much better to have immediate and
obvious failures when something's wrong than to have subtle behavioral
changes.
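
To make the "subtle behavioral changes" concrete, here is a small
sketch with made-up names (first_items, collect). It assumes the
proposed semantics are in effect, either through the PEP's
"from __future__ import generator_stop" or on an interpreter where they
are the default. Under the old semantics, the empty row would quietly
end the generator and the caller would just see a shorter list.

```python
def first_items(rows):
    # Hypothetical generator: yield the first item of each row.
    for row in rows:
        # next() raises StopIteration when a row is empty.
        yield next(iter(row))

def collect(rows):
    # Report what the caller observes when consuming the generator.
    try:
        return list(first_items(rows))
    except RuntimeError as exc:
        return "RuntimeError: " + str(exc)
```

With the new behaviour, collect([[1], [], [3]]) reports a RuntimeError
instead of silently returning [1], which is the immediate and obvious
failure I'm arguing for.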

ChrisA