[Python-ideas] x=(yield from) confusion [was:Yet another alternative name for yield-from]

Jacob Holm jh at improva.dk
Sun Apr 5 16:54:58 CEST 2009


Hi Guido

I like the way you are building the description up from the simple case, 
but I think you are missing a few details along the way.
Those details are what has been driving the discussion, so I think it is 
important to get them handled.  I'll comment on each point as I get to it.


Guido van Rossum wrote:
> I want to name the new exception ReturnFromGenerator to minimize the 
> similarity with GeneratorExit [...]

Fine with me, assuming we can't get rid of it altogether. 

[Snipped description of close() and __del__(), which I intend to comment 
on in the other thread]

> [Guido]
> >> Oh, and "yield from" competes with @coroutine over
> >> when the initial next() call is made, which again suggests the two
> >> styles (yield-from and coroutines) are incompatible.
> >
> > It is a serious problem, because one of the major points of the PEP 
> is that
> > it should be useful for refactoring coroutines.  As a matter of fact, I
> > started another thread on this specific issue earlier today which 
> only Nick
> > has so far responded to.  I think it is solvable, but requires some more
> > work.
>
> I think that's the thread where I asked you and Nick to stop making 
> more proposals. I am worried that a solution would become too complex, 
> and I want to keep the "naive" interpretation of "yield from EXPR" to 
> be as close as possible to "for x in EXPR: yield x". I think the 
> @coroutine generator (whether built-in or not) or explicit "priming" 
> by a next() call is fine.

I think it is important to be able to use yield-from with a @coroutine, 
but I'll wait a bit before I do more on that front (except for a few 
more comments in this mail).  There are plenty of other issues to tackle.
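For concreteness, the kind of @coroutine decorator I have in mind is the
usual priming wrapper, something along these lines (a sketch only; the
exact spelling does not matter for the argument):

import functools

def coroutine(func):
    # Prime a generator-based coroutine so it is immediately ready for
    # send()/throw(), by advancing it to its first yield on creation.
    @functools.wraps(func)
    def start(*args, **kwargs):
        gen = func(*args, **kwargs)
        next(gen)
        return gen
    return start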


> So now let me develop my full thoughts on yield-from. This is 
> unfortunately long, because I want to show some intermediate stages. I 
> am using a green font for new code. I am using stages, where each 
> stage provides a better approximation of the desired semantics. Note 
> that each stage *adds* some semantics for corner cases that weren't 
> handled the same way in the previous stage. Each stage proposes an 
> expansion for "RETVAL = yield from EXPR". I am using Py3k syntax.
[snip stage 1-3]
> 4. Stage four adds handling for ReturnFromGenerator, in both places 
> where next() is called:
>
> it = iter(EXPR)
> try:
>   x = next(it)
> except StopIteration:
>   RETVAL = e.value
> except ReturnFromGenerator as e:
>   RETVAL = e.value; break
> else:
>   while True:
>     yield x
>     try:
>       x = next(it)
>     except StopIteration:
>       RETVAL = None; break
>     except ReturnFromGenerator as e:
>       RETVAL = e.value; break
>  yield x

(There are two cut'n'paste errors here.  The first "break" and the 
second "yield x" shouldn't be there.  Just wanted to point it out in 
case this derivation makes it to the PEP)
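For reference, with those slips removed (and with the first StopIteration
clause setting RETVAL to None, as in the later stages), stage four would
read:

it = iter(EXPR)
try:
  x = next(it)
except StopIteration:
  RETVAL = None
except ReturnFromGenerator as e:
  RETVAL = e.value
else:
  while True:
    yield x
    try:
      x = next(it)
    except StopIteration:
      RETVAL = None; break
    except ReturnFromGenerator as e:
      RETVAL = e.value; break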

>
> 5. Stage five shows what should happen if "yield x" above returns a 
> value: it is passed into the subgenerator using send(). I am ignoring 
> for now what happens if it is not a generator; this will be cleared up 
> later. Note that the initial next() call does not change into a send() 
> call, because there is no value to send before we have yielded:
>
[snipped code for stage 5]

The argument that we have no value to send before we have yielded is 
wrong.  The generator containing the "yield-from" could easily have a 
value to send (or throw), and if iter(EXPR) returns a coroutine or a 
non-generator it could easily be ready to accept it.  That is the idea 
behind my attempted fixes to the @coroutine issue.
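As a made-up illustration (using the priming decorator sketched above):
the inner object can be ready for a send() the moment it is created,
before the delegating generator has yielded anything on its behalf:

@coroutine
def adder():
    total = 0
    while True:
        value = yield total   # already waiting here right after priming
        total += value

a = adder()
a.send(5)   # meaningful immediately; no preliminary next() is needed

If a delegating generator hands such an object to yield-from, the
expansion's unconditional initial next(it) is not obviously the right
first step.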

> 6. Stage six adds more refined semantics for when "yield x" raises an 
> exception: it is thrown into the generator, except if it is 
> GeneratorExit, in which case we close() the generator and re-raise it 
> (in this case the loop cannot continue so we do not set RETVAL):
>
[snipped code for stage 6]

This is where the fun begins.  In an earlier thread we concluded that if 
the thrown exception is a StopIteration and the *same* StopIteration 
instance escapes the throw() call, it should be re-raised rather than 
caught and turned into a RETVAL.  The reasoning was based on the 
following example:

def inner():
    for i in range(10):
        yield i

def outer():
    yield from inner()
    print("if StopIteration is thrown in we shouldn't get here")


Which we wanted to be equivalent to:

def outer():
    for i in range(10):
        yield i
    print("if StopIteration is thrown in we shouldn't get here")

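A quick caller-side sanity check of the behaviour we wanted (a sketch,
assuming the identity test discussed above ends up in the expansion):

g = outer()
next(g)                     # advance into inner()
try:
    g.throw(StopIteration)  # the *same* instance should escape outer()...
except StopIteration:
    pass                    # ...so outer() never reaches its print()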

The same argument goes for ReturnFromGenerator, so the expansion at this 
stage should be more like:

it = iter(EXPR)
try:
  x = next(it)
except StopIteration:
  RETVAL = None
except ReturnFromGenerator as e:
  RETVAL = e.value
else:
  while True:
    try:
      v = yield x
    except GeneratorExit:
      it.close()
      raise
    except BaseException as e:
      try:
        x = it.throw(e)  # IIRC this includes the correct traceback in 3.x so we don't need to use sys.exc_info
      except StopIteration as r:
        if r is e:
          raise
        RETVAL = None; break
      except ReturnFromGenerator as r:
        if r is e:
          raise
        RETVAL = r.value; break
    else:
      try:
        x = it.send(v)
      except StopIteration:
        RETVAL = None; break
      except ReturnFromGenerator as e:
        RETVAL = e.value; break


The next issue is that the value returned by it.close() is thrown away by 
yield-from.  Here is a silly example:

def inner():
    i = 0
    while True:
        try:
            yield
        except GeneratorExit:
            return i
        i += 1

def outer():
    try:
        yield from inner()
    except GeneratorExit:
        # nothing I can write here will get me the value returned from inner()
        pass


Also the trivial:

def outer():
    return yield from inner()


Would swallow the return value as well.

I have previously suggested attaching the return value to the (re)raised 
GeneratorExit, and/or saving the return value on the generator and 
making close return the value each time it is called.  We could also 
choose to define this as broken behavior and raise a RuntimeError, 
although it seems a bit strange to have yield-from treat it as an error 
when close doesn't.  Silently having the yield-from construct swallow 
the returned value is my least favored option.
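To illustrate the first of those suggestions, here is what outer() could 
look like if the value were attached to the re-raised GeneratorExit (the 
'value' attribute is hypothetical; nothing like it exists today):

def outer():
    try:
        yield from inner()
    except GeneratorExit as e:
        # Hypothetical: e.value would hold inner()'s return value under
        # the "attach it to the (re)raised GeneratorExit" suggestion.
        result = getattr(e, 'value', None)
        print("inner() returned", result)
        raise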

>
> 7. In stage 7 we finally ask ourselves what should happen if it is not 
> a generator (but some other iterator). The best answer seems subtle: 
> send() should degenerate to next(), and all exceptions should simply 
> be re-raised. We can conceptually specify this by simply re-using the 
> for-loop expansion:
>
> it = iter(EXPR)
> if <it is not a generator>:
>   for x in it:
>     yield x
>   RETVAL = None
> else:
>   try:
>     x = next(it)
>   except StopIteration:
>     RETVAL = None
>   except ReturnFromGenerator as e:
>     RETVAL = e.value
>   else:
>     while True:
>       try:
>         v = yield x
>       except GeneratorExit:
>         it.close()
>         raise
>       except:
>         try:
>           x = it.throw(*sys.exc_info())
>         except StopIteration:
>           RETVAL = None; break
>         except ReturnFromGenerator as e:
>           RETVAL = e.value; break
>       else:
>         try:
>           x = it.send(v)
>         except StopIteration:
>           RETVAL = None; break
>         except ReturnFromGenerator as e:
>           RETVAL = e.value; break
>
> Note: I don't mean that we literally should have a separate code path 
> for non-generators. But writing it this way adds the generator test to 
> one place in the spec, which helps understanding why I am choosing 
> these semantics. The entire code of stage 6 degenerates to stage 1 if 
> we make the following substitutions:
>
> it.send(v)                 -> next(it)
> it.throw(*sys.exc_info())  -> raise
> it.close()                 -> pass
>
> (Except for some edge cases if the incoming exception is StopIteration 
> or ReturnFromGenerator, so we'd have to do the test before entering 
> the try/except block around the throw() or send() call.)
>
> We could do this based on the presence or absence of the 
> send/throw/close attributes: this would be duck typing. Or we could 
> use isinstance(it, types.GeneratorType). I'm not sure there are strong 
> arguments for either interpretation. The type check might be a little 
> faster. We could even check for an exact type, since GeneratorType is 
> final. Perhaps the most important consideration is that if EXPR 
> produces a file stream object (which has a close() method), it would 
> not consistently be closed: it would be closed if the outer generator 
> was closed before reaching the end, but not if the loop was allowed to 
> run until the end of the file. So I'm leaning towards only making the 
> generator-specific method calls if it is really a generator.

Like Greg, I am in favor of duck-typing this as closely as possible.  My 
preferred treatment for converting stage 6 to stage 7 goes like this:

x = it.close() -->

  m = getattr(it, 'close', None)
  if m is not None:
      x = m()
  else:
      x = None

x = it.send(v) -->

  if v is None:
      x = next(it)
  else:
      try:
          m = it.send
      except AttributeError:
          m = getattr(it, 'close', None)
          if m is not None:
              m()  # in this case I think it is ok to ignore the return value
          raise
      else:
          x = m(v)

x = it.throw(e) -->

  m = getattr(it, 'throw', None)
  if m is not None:
      x = m(e)
  else:
      m = getattr(it, 'close', None)
      if m is not None:
          m()  # in this case I think it is ok to ignore the return value
      raise e
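Packaged as helper functions, purely as a restatement of the three
substitutions above (the names are my own):

def _yf_close(it):
    # Replacement for "x = it.close()" in the expansion.
    m = getattr(it, 'close', None)
    return m() if m is not None else None

def _yf_send(it, v):
    # Replacement for "x = it.send(v)" in the expansion.
    if v is None:
        return next(it)
    try:
        m = it.send
    except AttributeError:
        _yf_close(it)   # ok to ignore the returned value here
        raise
    return m(v)

def _yf_throw(it, e):
    # Replacement for "x = it.throw(e)" in the expansion.
    m = getattr(it, 'throw', None)
    if m is not None:
        return m(e)
    _yf_close(it)       # ok to ignore the returned value here
    raise e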


With this treatment it is easy enough to wrap the final iterator if you 
want different behavior.  With your version it becomes difficult to 
replace a generator used in a yield-from with a plain iterator.  (You 
would have to wrap the iterator in a generator consisting mostly of the 
expansion from this PEP with the above substitutions.)

I don't think we need to worry about performance at this stage.  AFAICT 
from the patch I was working on, the cost of a few extra checks is 
negligible compared to the savings you get from using yield-from in the 
first place.

Best regards
- Jacob
