[Python-ideas] Possible PEP 380 tweak

Guido van Rossum guido at python.org
Sat Oct 30 17:00:32 CEST 2010


On Fri, Oct 29, 2010 at 11:16 PM, Greg Ewing
<greg.ewing at canterbury.ac.nz> wrote:
> Guido van Rossum wrote:
>>
>> On Fri, Oct 29, 2010 at 8:09 PM, Greg Ewing <greg.ewing at canterbury.ac.nz>
>> wrote:
>>
>>> and even then you would lose the return value
>>> from the inner generator, which you're probably going to
>>> want.
>>
>> Really? Can you show a realistic use case?
>
> Here's an attempt:
>
>  def variancer():
>    # Compute variance of values sent in (details left
>    # as an exercise)
>
>  def stddever():
>    # Compute standard deviation of values sent in
>    v = yield from variancer()
>    return sqrt(v)

Good.
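For concreteness, here is one way Greg's sketch could be filled in. The variance computation is my choice (Welford's online algorithm; he left the details as an exercise), and the driver's termination protocol — throwing EOFError in and capturing the return value from StopIteration — is my assumption, patterned on what follows:

```python
from math import sqrt

def variancer():
    # Welford's online algorithm -- my stand-in for "details left as
    # an exercise".  Values arrive via send(); throwing EOFError in
    # ends the stream and returns the population variance.
    n = 0
    mean = 0.0
    m2 = 0.0
    try:
        while True:
            x = yield
            n += 1
            delta = x - mean
            mean += delta / n
            m2 += delta * (x - mean)
    except EOFError:
        return m2 / n if n else 0.0

def stddever():
    # Compute standard deviation of values sent in.
    v = yield from variancer()
    return sqrt(v)

def run(g, values):
    # Hypothetical driver: prime the generator, feed it values, then
    # throw EOFError and capture the return value from StopIteration.
    next(g)
    for x in values:
        g.send(x)
    try:
        g.throw(EOFError)
    except StopIteration as err:
        return err.value

result = run(stddever(), [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
print(result)  # ≈ 2.0
```

Note that the EOFError thrown into stddever() is delivered to the yield inside variancer(), which returns its value; the yield-from then completes and stddever() returns the square root — exactly the chaining under discussion.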

I have to get a crazy idea off my chest: maybe the collective hang-up
is that GeneratorExit must be special-cased. Let me explore a bit.

Take a binary tree node:

class Node:
  def __init__(self, label, left=None, right=None):
    self.label, self.left, self.right = label, left, right

And an inorder traversal function:

def inorder(node):
  if node:
    yield from inorder(node.left)
    yield node
    yield from inorder(node.right)

This is a nice example, different from gtally() and
variancer()/stddever(), because of the recursion.
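As a quick sanity check of the plain traversal (repeating the definitions so the snippet runs on its own):

```python
class Node:
  def __init__(self, label, left=None, right=None):
    self.label, self.left, self.right = label, left, right

def inorder(node):
  if node:
    yield from inorder(node.left)
    yield node
    yield from inorder(node.right)

# A three-node tree: 2 at the root, 1 and 3 as leaves.
tree = Node(2, Node(1), Node(3))
labels = [n.label for n in inorder(tree)]
print(labels)  # → [1, 2, 3]

# Stopping early is trivial with this version: just close it.
g = inorder(tree)
next(g)
g.close()  # no error; the traversal is simply abandoned
```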

Now let's say we want to design a protocol whereby the consumer of the
nodes yielded by inorder() can ask the traversal to be stopped. With
the code above this is trivial, just call g.close() or throw any other
exception in. But now let's first modify inorder() to also return a
value computed from the nodes traversed so far. For simplicity I'll
use the count:

def inorder(node):
  if not node:
    return 0
  count = 0
  count += yield from inorder(node.left)
  yield node
  count += 1
  count += yield from inorder(node.right)
  return count
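Exhausting this version and picking up the count takes a small driver, since a plain for-loop discards the value carried by StopIteration (exhaust() is my name for it; definitions repeated so this runs standalone):

```python
class Node:
  def __init__(self, label, left=None, right=None):
    self.label, self.left, self.right = label, left, right

def inorder(node):
  # Counting variant: returns the number of nodes traversed.
  if not node:
    return 0
  count = 0
  count += yield from inorder(node.left)
  yield node
  count += 1
  count += yield from inorder(node.right)
  return count

def exhaust(g):
  # Run the generator to completion and capture the return value
  # that PEP 380 stores on StopIteration.
  while True:
    try:
      next(g)
    except StopIteration as err:
      return err.value

tree = Node(2, Node(1), Node(3))
total = exhaust(inorder(tree))
print(total)  # → 3
```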

How would we stop this enumeration *and* receive a count of the nodes
already enumerated up to that point? Throwing in some exception is the
easiest approach. Let's say we throw EOFError. My first attempt has a
bug:

def inorder(node):
  if not node:
    return 0
  count = 0
  count += yield from inorder(node.left)  # Bug here
  try:
    count += 1
    yield node
  except EOFError:
    return count
  count += yield from inorder(node.right)
  return count

This has the fatal flaw of not responding promptly when the EOFError
is caught by the left subtree, since it returns normally and the
parent doesn't "see" the EOFError: on the way in it's thrown directly
into the first yield-from; on the way out there's no way to
distinguish between a regular return and an interrupted one.

A potential fix is to return two values: an interrupted flag and a
count. But this is pretty ugly (I'm not even going to show the code).

A different approach to fixing this is for the throwing code to keep
throwing EOFError until the generator stops yielding values:

def stop(g):
  while True:
    try:
      g.throw(EOFError)
    except StopIteration as err:
      return err.value

I'm not concerned with the situation where the generator is already
stopped; the EOFError will be bounced out, but that is the caller's
problem, as they shouldn't have attempted to stop an already-stopped
iterator. (Jacob is probably shaking his head here. :-)
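Here is the repeated-throw stop() in action against the EOFError-aware inorder() (definitions repeated so this runs standalone). Note what it returns: the consumer has seen only one node, but the count includes the node yielded while the stack wound down:

```python
class Node:
  def __init__(self, label, left=None, right=None):
    self.label, self.left, self.right = label, left, right

def inorder(node):
  # The version above: count is bumped *before* yielding.
  if not node:
    return 0
  count = 0
  count += yield from inorder(node.left)
  try:
    count += 1
    yield node
  except EOFError:
    return count
  count += yield from inorder(node.right)
  return count

def stop(g):
  while True:
    try:
      g.throw(EOFError)
    except StopIteration as err:
      return err.value

tree = Node(2, Node(1), Node(3))
g = inorder(tree)
seen = next(g)           # consumer has seen one node (label 1)
stop_result = stop(g)
print(seen.label, stop_result)  # → 1 2
```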

This solution doesn't quite work though, because the count returned
will include the nodes that were yielded while the stack of generators
was winding down. My pragmatic solution for this is to change the
protocol so that stopping the generator means that the node yielded
last should not be included in the count. If you envision the caller
to be running a for-loop, think of calling stop() at the top of the
loop rather than at the bottom. (Jacob is now again wondering how
they'd get the count if the iterator runs till completion. :-) We can
do this by modifying inorder() to bump the count after yielding rather
than before:

  try:
    yield node
  except EOFError:
    return count
  count += 1

Now, to get back the semantics of getting the correct count
*including* the last node seen by the caller, we can modify stop() to
advance the generator by one more step:

def stop(g):
  try:
    next(g)
    while True:
      g.throw(EOFError)
  except StopIteration as err:
    return err.value
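Putting the bump-after-yield inorder() together with this revised stop() (definitions repeated so the snippet runs on its own), the count now matches the nodes the caller actually saw, and stop() also works after full exhaustion:

```python
class Node:
  def __init__(self, label, left=None, right=None):
    self.label, self.left, self.right = label, left, right

def inorder(node):
  # Fixed protocol: count is bumped *after* the yield.
  if not node:
    return 0
  count = 0
  count += yield from inorder(node.left)
  try:
    yield node
  except EOFError:
    return count
  count += 1
  count += yield from inorder(node.right)
  return count

def stop(g):
  try:
    next(g)            # advance one more step first
    while True:
      g.throw(EOFError)
  except StopIteration as err:
    return err.value

tree = Node(2, Node(1), Node(3))

# Stop after the first node: the count includes the node just seen.
g1 = inorder(tree)
first = next(g1)
early_count = stop(g1)
print(first.label, early_count)  # → 1 1

# Consume everything, then stop() still retrieves the total.
g2 = inorder(tree)
labels = [next(g2).label for _ in range(3)]
total = stop(g2)
print(labels, total)  # → [1, 2, 3] 3
```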

This works even if g was positioned after the last item to be yielded:
in that case next(g) raises StopIteration. It still doesn't work if we
use a for-loop to iterate through the end (Jacob nods :-) but I say
they shouldn't be doing that, or they can write a little wrapper for
iter() that *does* save the return value from StopIteration. (Okay,
half of me says it would be fine to store it on the generator object.
:-)
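The little iter()-style wrapper could look something like this (the name and shape are mine, not from the message):

```python
class ValueIterator:
  # Wrap a generator so that, after a for-loop exhausts it, the
  # return value from StopIteration is kept on .value.
  def __init__(self, gen):
    self.gen = gen
    self.value = None

  def __iter__(self):
    self.value = yield from self.gen

def tallier():
  # Hypothetical stand-in for a generator with a return value.
  yield 1
  yield 2
  return "the tally"

it = ValueIterator(tallier())
items = list(it)
print(items, it.value)  # → [1, 2] the tally
```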

[Dramatic pause]

[Drumroll]

What has this got to do with GeneratorExit and g.close()? I propose to
modify g.close() to keep throwing GeneratorExit until the generator
stops yielding values, and then capture the return value from
StopIteration if that is what was raised. The beauty is then that the
PEP 380 expansion can stop special-casing GeneratorExit: it just
treats it like any other exception. And stddever() above works! (If
you worry about infinite loops: you can get those anyway, by putting
"while True:" in an "except GeneratorExit" block. I don't see much
reason to worry more in this case.)
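To make the proposal concrete, the modified close() behavior can be emulated today as a helper function (a sketch of the proposed semantics, not how close() actually behaves):

```python
def close_and_return(g):
  # Keep throwing GeneratorExit until the generator stops yielding;
  # if it finishes by returning a value, capture it from
  # StopIteration, as the modified close() would.
  try:
    while True:
      g.throw(GeneratorExit)
  except StopIteration as err:
    return err.value
  except GeneratorExit:
    return None  # generator didn't catch it; no return value

def counter():
  # Yields 1, 2, 3, ...; on GeneratorExit, returns the last value
  # it yielded.
  n = 0
  try:
    while True:
      n += 1
      yield n
  except GeneratorExit:
    return n

g = counter()
next(g)  # 1
next(g)  # 2
result = close_and_return(g)
print(result)  # → 2
```

Under current semantics this must use throw() rather than close(), because close() discards the StopIteration value (and complains if the generator yields); the proposal is precisely to fold this loop-and-capture behavior into close() itself.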

-- 
--Guido van Rossum (python.org/~guido)


