[Python-ideas] Local only statement mapping with "given/get" blocks.

Ron Adam ron3200 at gmail.com
Mon Oct 17 22:06:21 CEST 2011


On Mon, 2011-10-17 at 22:05 +1000, Nick Coghlan wrote:

> On Mon, Oct 17, 2011 at 5:52 PM, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> > Nick Coghlan wrote:
> >>
> >> Yeah, that's a large part of why I now think the given clause needs to
> >> be built on the same semantics that we already use internally for
> >> implicit out of order evaluation (i.e. decorators, comprehensions and
> >> generator expressions), such that it merely exposes the unifying
> >> mechanic underlying existing constructs rather than creating a
> >> completely new way of doing things.
> >
> > I'm not sure what you mean by that. If you're talking about
> > the implementation, all three of those use rather different
> > underlying mechanics. What exactly do you see about these
> > that unifies them?
> 
> Actually, comprehensions and generator expressions are almost
> identical in 3.x (they only differ in the details of the inner loop in
> the anonymous function).
> 
> For comprehensions, the parallel with the proposed given statement
> would be almost exact:
> 
>     seq = [x*y for x in range(10) for y in range(5)]
> 
> would map to:
> 
>     seq = _list_comp given _outermost_iter = range(10):
>         _list_comp = []
>         for x in _outermost_iter:
>             for y in range(5):
>                 _list_comp.append(x*y)



Ok, here's a way to look at this that I think you will find interesting.


It looks to me like the 'given' keyword, as it's used here, is setting up
a local namespace.  So rather than taking an expression, maybe it should
take a mapping (which could itself come from an expression).


    mapping = dict(iter1=range(10), iter2=range(5))
    given mapping:
        # mapping as local scope
        list_comp=[]
        for x in iter1:
            for y in iter2:
                list_comp.append(x*y)
    seq = mapping['list_comp']
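
For comparison, here is a rough sketch of how the block above could be
approximated in today's Python.  It is only an emulation: it assumes the
suite is held as a string (the name 'body' is made up for this example) and
uses exec() with the mapping passed as the local namespace.

```python
import textwrap

mapping = dict(iter1=range(10), iter2=range(5))

# The suite that would sit under "given mapping:", held as source text.
body = textwrap.dedent("""
    list_comp = []
    for x in iter1:
        for y in iter2:
            list_comp.append(x*y)
""")

# Run the suite with `mapping` as its local namespace.  Names the suite
# reads (iter1, iter2) come from the mapping, and names it binds
# (list_comp, x, y) are stored back into it.
exec(body, {}, mapping)

seq = mapping['list_comp']
```

One caveat with this emulation: exec() with separate globals and locals
scopes names the way a class body does, so a function defined inside the
suite would not see the mapping's names when it is called later.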

(We could stop here.)

This doesn't do anything out of order.  It shows that a statement-local
namespace and out-of-order assignment are two completely different
things. But let's continue...


Suppose we use a two-suite pattern to make getting values out easier.

    mapping = dict(iter1=range(10), iter2=range(5))
    given mapping:
        list_comp=[]
        for x in iter1:
            for y in iter2:
                list_comp.append(x*y)
    get:
        list_comp as seq         # seq = mapping['list_comp']
        
That saves us from having to refer to 'mapping' multiple times,
especially if we need to get a lot of values from it.


So now we can change the above to ...

    given dict(iter1=range(10), iter2=range(5)):
        list_comp=[]
        for x in iter1:
            for y in iter2:
                list_comp.append(x*y)
    get:
        list_comp as seq


And then finally put the 'get' block first.

    get:
        list_comp as seq
    given dict(iter1=range(10), iter2=range(5)):
        list_comp=[]
        for x in iter1:
            for y in iter2:
                list_comp.append(x*y)

Which is very close to the example you gave above, but more readable
because it puts the keywords in the front.  That also makes it more like
a statement than an expression.

Note that if you use a named mapping with given, you can inspect it
after the given block is done, and/or reuse it multiple times.  I think
that will be very useful for unittests.
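
To make the inspect/reuse point concrete, here is a small sketch in current
Python.  The helper 'given()' and its 'get' argument are hypothetical names
invented for this example; they only imitate the proposed blocks by running
a source string with exec() and the mapping as locals.

```python
import textwrap

def given(mapping, body, get):
    """Run `body` with `mapping` as locals; return the names listed in `get`."""
    exec(textwrap.dedent(body), {}, mapping)
    return tuple(mapping[name] for name in get)

body = """
    list_comp = []
    for x in iter1:
        for y in iter2:
            list_comp.append(x*y)
"""

mapping = dict(iter1=range(3), iter2=range(2))
(seq,) = given(mapping, body, get=['list_comp'])

# The named mapping is still around afterwards, so it can be inspected.
names_after = sorted(mapping)

# It can also be reused with different inputs.
mapping.update(iter1=range(2), iter2=range(2))
(seq2,) = given(mapping, body, get=['list_comp'])
```

A test could assert on 'mapping' directly after the block runs, including
on intermediate names such as the loop variables.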


This gives us a nice way to express, in pure Python, some kinds of blocks
that have local-only names, rather than just saying it's magic dust
sprinkled here and there to make it work like that.

(That doesn't mean we should actually change those, but the semantics
could match.)


> And similarly for set and dict comprehensions:
> 
>     # unique = {x*y for x in range(10) for y in range(5)}
>     unique = _set_comp given _outermost_iter = range(10):
>         _set_comp = set()
>         for x in _outermost_iter:
>             for y in range(5):
>                 _set_comp.add(x*y)



    get:
        set_comp as unique
    given dict(iter1=range(10), iter2=range(5)):
        set_comp = set()
        for x in iter1:
            for y in iter2:
                set_comp.add(x*y)



>     # map = {(x, y):x*y for x in range(10) for y in range(5)}
>     map = _dict_comp given _outermost_iter = range(10):
>         _anon = {}
>         for x in _outermost_iter:
>             for y in range(5):
>                 _anon[x,y] = x*y



    get:
        dict_comp as map
    given dict(iter1=range(10), iter2=range(5)):
        dict_comp = {}
        for x in iter1:
            for y in iter2:
                dict_comp[x, y] = x*y


I'm not sure if I prefer the "get" block first or last.

    given dict(iter1=range(10), iter2=range(5)):
        dict_comp = {}
        for x in iter1:
            for y in iter2:
                dict_comp[x, y] = x*y
    get:
        dict_comp as map

But the given/get order is a detail you can put to a final vote at
some later time.



> Note that this lays bare some of the quirks of comprehension scoping -
> at class scope, the outermost iterator expression can sometimes see
> names that the inner iterator expressions miss.
> 
> For generator expressions, the parallel isn't quite as strong, since
> the compiler is able to avoid the redundant anonymous function
> involved in the given clause and just emit an anonymous generator
> directly. However, the general principle still holds:
> 
>     # gen_iter = (x*y for x in range(10) for y in range(5))
>     gen_iter = _genexp() given _outermost_iter = range(10):
>         def _genexp():
>             for x in _outermost_iter:
>                 for y in range(5):
>                     yield x*y




    given dict(iter1=range(10), iter2=range(5)):
        def genexp():
            for x in iter1:
                for y in iter2:
                    yield x*y
    get:
        genexp as gen_iter



Interestingly, if we transform the given blocks a bit more we get
something that is nearly a function.

    given Signature(<signature>).bind(mapping):
        ... function body ...
    get:
        ... return values ...


('def' would wrap it in an object, and give it a name.)
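
As a rough illustration of the "nearly a function" idea in current Python:
inspect.signature() and Signature.bind() are real, but everything else here
(the 'template' function used only for its signature, the 'call_args' and
'namespace' names, and running the body through exec()) is just an
assumption made for the sketch.

```python
import inspect
import textwrap

def template(iter1, iter2=range(5)):
    """Used only for its signature; this body is never executed."""

# Bind the call arguments to the signature, filling in defaults,
# which yields a mapping of parameter names to values.
call_args = dict(iter1=range(10))
bound = inspect.signature(template).bind(**call_args)
bound.apply_defaults()
namespace = dict(bound.arguments)

# "given <bound mapping>:" ... function body ...
exec(textwrap.dedent("""
    total = 0
    for x in iter1:
        for y in iter2:
            total += x*y
"""), {}, namespace)

# "get:" ... return values ...
result = namespace['total']
```

The binding step does the argument checking a real call would do, so a
missing or unexpected argument raises TypeError before the body runs.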


So it looks like it has potential to unify some underlying mechanisms as
well as create a nice local only statement space.

What I like about it is that it appears to complement Python very well
and doesn't feel like something tacked on.  I think having given
take a mapping is what did that for me.


Cheers,
    Ron



> For decorated functions, the parallel is actually almost as weak as it
> is for classes, since so many of the expressions involved (decorator
> expressions, default arguments, annotations) get evaluated in order in
> the current scope and even a given statement can't reproduce the
> actual function statement's behaviour of not being bound at *all* in
> the current scope while decorators are being applied, even though the
> function already knows what it is going to be called:


It's hard to beat a syntax that is only one character long. ;-)



> >>> def call(f):
> ...     print(f.__name__)
> ...     return f()
> ...
> >>> @call
> ... def func():
> ...     return func.__name__
> ...
> func
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "<stdin>", line 3, in call
>   File "<stdin>", line 3, in func
> NameError: global name 'func' is not defined
> 
> So it's really only the machinery underlying comprehensions that is
> being exposed by the PEP rather than anything more far reaching.
> 
> Exposing the generator expression machinery directly would require the
> ability to turn the given clause into a generator (via a top level
> yield expression) and then a means to reference that from the header
> line, which gets us back into cryptic and unintuitive PEP 403
> territory. Better to settle for the named alternative.
> 
> Cheers,
> Nick.
> 

