Callable generators (PEP 288: Generator Attributes, again)

Francis Avila francisgavila at yahoo.com
Tue Nov 18 11:11:48 CET 2003


A little annoyed one day that I couldn't use the statefulness of
generators as "resumable functions", I came across Hettinger's PEP 288
(http://www.python.org/peps/pep-0288.html, still listed as open, even
though it's at least a year old and Guido doesn't seem very hot on the
idea).  I'm not too sure of its ideas on raising exceptions in
generators from outside (although it looks like it might be convenient
in some cases), but being able to pass names into generators is
definitely something that seems natural and worth having.  There's
currently no easy way to statefully consume an iterator/generator.

I'm sorry if I'm raising a dead issue, but it *is* still "Draft", and
with generator expressions on the way, it might be worth bringing up
again.

I'm mostly developing the ideas found in this thread, started by Bengt
Richter nearly a year ago:
http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&threadm=asjh27%2480v%240%40216.39.172.122&rnum=1&prev=/groups%3Fq%3Dgenerator%2Bgroup%253Acomp.lang.python%26ie%3DISO-8859-1%26hl%3Den
(If that's too much, google 'generator group:comp.lang.python')

Also on python-dev:
http://www.python.org/dev/summary/2002-11-16_2002-11-30.html
(The thread fizzled out without any clear resolution.)

The basic thing I picked up from the threads above is that generator
attributes are too class-like for generators, and that value-passing
to generators should have more function-like semantics.  Justification
and proposed syntax/semantics follow.

Beware: hand-waving generalizations follow.

In the various wranglings in this and other threads about the
semantics of passing values into generators, one thing that didn't get
mentioned much was to think of generators as "resumable functions"
instead of "simple class instances".  The idea of using generator
attributes is the "generators as class instances" view, and as many
mentioned, the semantics don't quite fit.  If we think of generators
as resumable functions, a lot of more natural value-passing semantics
suggest themselves.

A function has a parameter list, which assigns values to locals; its
body does something with those locals and then returns something.
The parameter list has some rich functionality, using default args and
keyword args, but basically that's all a function is.  This is in
contrast to
classes, which are not only stateful, but have multiple methods (which
can be called in any order) and make constant use of their own
attributes (which functions can't easily access at all).  Classes have
much more free-form control flow.  Functions model algorithms, whereas
classes model things.

Like class instances, generators are stateful, have an initialization,
and are produced by a factory (the function definition with a yield in
the body for generators, the class object for instances).  In all
other respects, however, I think they're much more like functions.
Like functions, generators have (conceptually) no methods: they just
continuously return.  Also like functions, they generally model an
algorithm, not a thing.  Also, they can't access their own attributes.

Generator initialization is already modeled (the parameter list of the
function definition, and the body of the function up to the first
yield).  Generator returning is, too (the next() method).  It's the
function's callability that is *not* modeled.

Functions dump passed values into a local namespace according to a
parameter list.  Let's make generators do the same:

def echo():
    while True:
        yield something

# Add some syntactic sugar to do the following if you like.
echo.parameter_list = 'something=1'

# Here's an idle suggestion:
#           def echo() (something=1): ...
# I.e.: def <name> ( <initial_parameter_list> ) ( <consumer_param_list> ):

# Although really I think generator definitions should have gotten
# their own keyword instead of overloading 'def'.  Oh well.

cecho = echo()
#cecho.next() raises NameError here.
cecho()  # yields 1
cecho(10) # yields 10
# argument default overwrites 'something' in the local namespace:
cecho()  # yields 1
cecho('abc') # yields 'abc'
# next() bypasses function-like locals() updating:
cecho.next() # yields 'abc'

I.e.: If called as a function, update the local namespace, then yield.
      If called as an iterator, just yield.
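
A minimal sketch of these semantics, written as an ordinary class
because generators cannot do this today.  The class name CallableEcho
and its _locals dict are stand-ins invented for illustration (they are
not part of PEP 288 or of any proposal), and the code uses Python 3
spelling (__next__ rather than next):

```python
class CallableEcho:
    """Hand-written stand-in for the proposed callable generator 'echo'."""

    def __init__(self):
        # Plays the role of the generator's local namespace.
        self._locals = {}

    def __call__(self, something=1):
        # Called as a function: bind the arguments, then yield.
        self._locals['something'] = something
        return next(self)

    def __next__(self):
        # Called as an iterator: just yield, without touching the locals.
        if 'something' not in self._locals:
            raise NameError("name 'something' is not defined")
        return self._locals['something']

    def __iter__(self):
        return self


cecho = CallableEcho()
first = cecho()        # default applies: 1
second = cecho(10)     # 10
third = cecho()        # default applies again: 1
fourth = cecho('abc')  # 'abc'
fifth = next(cecho)    # bypasses binding; still 'abc'
```

Calling next() on a fresh CallableEcho raises NameError, matching the
commented-out cecho.next() line above.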

As things are now, this is easier said than done, of course: the local
namespace of generators (functions, too) is read-only.  (Could anyone
go into the reasons for this?  Is it an arbitrary restriction for
safety/optimization/sanity, or a severe implementation limitation, or
what?)

Here is a pie-in-the-sky implementation of the above semantics (which
looks an awful lot like currying):

import types

class consumer(object):
    """consumer(func, *args, **kargs) -> callable generator

    func must return a generator
    func.parameter_list gives calling semantics of the generator.

    """
    def __init__(self, func, *args, **kargs):
        self._gen = func(*args, **kargs)

        # The following check isn't very useful because generator-like
        # things don't have a common base class.  What we really need
        # to do is check co_flags of func's code object.

        if not isinstance(self._gen, types.GeneratorType):
            raise TypeError, "func must return a generator."
        
        try:
            params = func.parameter_list
        except AttributeError:
            params = ''
            
        exec 'def flocals(%s): return locals()' % params
        self.flocals = flocals

    def __call__(self, *args, **kargs):
        # This doesn't give very good error messages.  Ideally, they
        # would be identical to those given by calling functions with
        # badly formed arguments.

        newlocals = self.flocals(*args, **kargs)
        self._gen.gi_frame.f_locals.update(newlocals) #doesn't work
        return self._gen.next()
    def next(self):
        return self._gen.next()
        
(Notice that there's nothing in the above that would *require* that
the generator's parameter list remain static...but forget I said that.)
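
Why the f_locals line has no effect can be seen with a small
experiment; this is a sketch assuming CPython and Python 3 syntax.
Because the name is never assigned inside echo, the compiler treats it
as a global, so whatever is written into the frame's locals mapping is
never consulted.  The name pep288_something is made up here so that it
cannot clash with a real global:

```python
def echo():
    while True:
        # Never assigned in this body, so compiled as a global lookup.
        yield pep288_something


gen = echo()
# Attempt to inject a "local" from outside, as consumer.__call__ tries to.
gen.gi_frame.f_locals['pep288_something'] = 1

try:
    next(gen)
    injected = True
except NameError:
    # The lookup went to globals and builtins, not to f_locals.
    injected = False
```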

To get around the f_locals read-only problem, I tried recasting func
with new globals (using 'new.function(func.func_code,
self.my_globals, ...)'), where self.my_globals is a dict-like object
that provides an "in-between" namespace between locals and globals.
I couldn't figure
out how to get __builtins__ lookups to work, and anyway, this approach
ultimately fails if the generator ever assigns to these names in its
local namespace--a pretty serious restriction.  Then generators would
require declaring all such argument names global, or never assigning
to them at all.  And they'd never be able to access like-named
globals.
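
The globals-swapping approach and its main limitation can be sketched
as follows, in Python 3 spelling (types.FunctionType and __code__ in
place of new.function and func_code).  This is an illustration under
assumptions, not the code described above: a plain dict stands in for
the dict-like in-between namespace, and placing the builtins module
under the '__builtins__' key is one way to keep builtin lookups
working.  The generator names shout and broken are invented here:

```python
import builtins
import types


def shout():
    while True:
        # 'something' is read but never assigned: a global lookup.
        yield str(something).upper()


def broken():
    while True:
        yield something
        something = None   # any assignment makes the name local


# Stand-in for the "in-between" namespace.
namespace = {'__builtins__': builtins}

rebound = types.FunctionType(shout.__code__, namespace, 'shout')
gen = rebound()
namespace['something'] = 'abc'
first = next(gen)          # 'ABC'
namespace['something'] = 10
second = next(gen)         # '10'

bad = types.FunctionType(broken.__code__, namespace, 'broken')()
try:
    next(bad)
    failure = None
except UnboundLocalError as exc:
    # The injected value is invisible once the name is a local.
    failure = exc
```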

So, we're back to the __self__ mess, like before (i.e., all arguments
must be attributes of __self__).  At least that can be implemented
fairly easily (there's a recipe at ASPN).
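
The attribute-based fallback can be sketched like this in Python 3.
This is not the ASPN recipe, just one possible shape: the generator
receives a namespace object as its first argument (called self_ here),
and a plain function's signature stands in for parameter_list.
Consumer, self_, and the params argument are all names invented for
this sketch:

```python
import inspect
import types


class Consumer:
    """Callable wrapper; arguments land as attributes on a namespace."""

    def __init__(self, genfunc, params, *args, **kwargs):
        # 'params' is only inspected for its signature, never called.
        self._signature = inspect.signature(params)
        self._namespace = types.SimpleNamespace()
        self._gen = genfunc(self._namespace, *args, **kwargs)
        if not inspect.isgenerator(self._gen):
            raise TypeError("genfunc must return a generator")

    def __call__(self, *args, **kwargs):
        # bind() raises TypeError for badly formed arguments,
        # much as an ordinary function call would.
        bound = self._signature.bind(*args, **kwargs)
        bound.apply_defaults()
        vars(self._namespace).update(bound.arguments)
        return next(self._gen)

    def __next__(self):
        # Iterator protocol: resume without rebinding anything.
        return next(self._gen)

    def __iter__(self):
        return self


def echo(self_):
    while True:
        yield self_.something


cecho = Consumer(echo, lambda something=1: None)
results = [cecho(), cecho(10), cecho(), cecho('abc'), next(cecho)]
```

The cost is visible in echo itself: every consumed name has to be
spelled self_.something instead of being an ordinary local.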

Notice that these consumer-generators share something of classes and
something of functions, but fill in the "function" half that simple
generators didn't quite fill. (They only filled the 'class' half by
being stateful.)

As for raising exceptions in a generator from the outside, that I'm
not too sure of, but I can't think of any argument against it.  And
here's one for it:  A consumer-generator can be used as a simple state
machine, which has a start and end condition.  The start condition is
set up by the initial generator-factory call, the run conditions by
the call/yield of the generator, but without being able to raise an
exception in the generator, we need to overload the calling semantics
to signal an end state.  Being able to raise an exception within the
generator would make that cleaner.  Still, I think it's a separate
consideration.
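
For reference, later Python versions (2.5 onward, via PEP 342) gave
generators a throw() method that raises an exception inside the
generator at the point where it is paused.  A minimal sketch of
signalling an end state that way, in Python 3 syntax; the Finish
exception and the counter generator are invented for illustration:

```python
class Finish(Exception):
    """Hypothetical signal telling the state machine to wrap up."""


def counter():
    count = 0
    try:
        while True:
            yield count        # run state: report the current count
            count += 1
    except Finish:
        yield ('done', count)  # end state: report the final result


machine = counter()
running = [next(machine), next(machine), next(machine)]  # [0, 1, 2]
final = machine.throw(Finish())                          # ('done', 2)
```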

Personally, I'd like to one day be able to do stupid things like this:
>>> e = echo()
>>> tuple(feed(e, xrange(5)))
(0, 1, 2, 3, 4)
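
feed is not defined anywhere above; here is one guess at what it could
look like, in Python 3 (range instead of xrange).  It only assumes the
consumer can be called with one argument and returns the next yielded
value, so a trivial function stands in for a real callable generator:

```python
def feed(consumer, iterable):
    """Call consumer once per item and yield whatever it returns."""
    for item in iterable:
        yield consumer(item)


def echo_stand_in(something=1):
    # Hypothetical placeholder for a callable generator such as cecho.
    return something


result = tuple(feed(echo_stand_in, range(5)))  # (0, 1, 2, 3, 4)
```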

--
Francis Avila



