[Python-ideas] Changing semantics of for-loop variable (was: Tweaking closures ...)

Nick Coghlan ncoghlan at gmail.com
Fri Sep 30 00:04:47 CEST 2011


On Thu, Sep 29, 2011 at 5:31 PM, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> Guido has asked me to start a new thread for discussing
> this idea.
>
> To recap, instead of trying to come up with some new
> sugar to make the default-argument hack taste slightly
> less bitter, I suggest making a small change to the
> semantics of for-loops:
>
> If the loop variable is referenced from an inner scope,
> instead of replacing the contents of its cell, create
> a *new* cell on each iteration.
>
> Code following the loop would then continue to see the
> last value bound to the loop variable, as now, but
> inner functions would capture different versions of
> it.

That's potentially really nasty from an eval loop point of view.
However, I think there's a way to make it work fairly naturally.

Currently, the compiler takes all of the following names and
effectively mashes them into a flat list: ordinary locals, cell vars
(i.e. locals referenced from inner scopes) and free vars (i.e. names
from outer lexical scopes). While technically 'free vars' in the
normal usage of the term, global and builtin references are handled
separately.

The bytecode for the function then includes the following kinds of operations:

    LOAD/STORE_FAST: work on ordinary locals via a numeric index
    LOAD/STORE_DEREF: work on cells (i.e. cell vars and free vars) via a numeric index
    LOAD/STORE_GLOBAL: dynamic lookup in the module globals and then builtins by name

The key point is that the compiler has enough information to figure
all this out at compile time, so any change to loop semantics would
need to work in with that.
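
The compile-time classification described above can be inspected on a code object. The following is a minimal sketch, assuming CPython 3; the function and variable names (outer, inner, plain, shared) are made up for illustration, and the exact set of opcode names printed at the end varies between interpreter versions.

```python
import dis

def outer():
    plain = 1                      # ordinary local: accessed with *_FAST opcodes
    shared = 2                     # cell var: also read by inner(), so it lives in a cell
    def inner():
        return shared + len("x")   # 'shared' is a free var here; 'len' is a builtin
    return inner, plain

inner, _ = outer()
outer_code = outer.__code__
inner_code = inner.__code__

print(outer_code.co_varnames)      # ordinary locals of outer()
print(outer_code.co_cellvars)      # ('shared',)
print(inner_code.co_freevars)      # ('shared',)

# Opcode names are version dependent, so they are only listed, not asserted.
print(sorted({ins.opname for ins in dis.get_instructions(outer)}))
print(sorted({ins.opname for ins in dis.get_instructions(inner)}))
```

The name tuples are fixed when the function is compiled, which is why any new loop behaviour would have to be selected by the compiler rather than decided at run time.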

To make the loop rebinding work, it would probably be enough to change
for loops and comprehensions to emit a new REPLACE_CELL opcode such
that instead of replacing the contents of the existing cell they
created a *new* cell and replaced the entire cell. Nested scopes from
previous iterations would still have a reference to the old cell but
all future references would see the new cell.

This wouldn't have any impact on ordinary loops, since the new
REPLACE_CELL would only be used where STORE_DEREF is used currently.
If there aren't any nested scopes involved, then STORE_FAST (or
STORE_NAME at module or class level) would still get used.
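
The difference between storing into the existing cell and swapping in a new cell can be emulated in pure Python. This is only a sketch, assuming CPython 3.8 or later (for types.CellType); it builds the closures by hand rather than via any new opcode, and the helper names (_template, INNER_CODE, emulated_new_cell_semantics) are hypothetical.

```python
import types

def current_semantics():
    funcs = []
    for i in range(3):
        funcs.append(lambda: i)      # every lambda shares the single cell for i
    return [f() for f in funcs]

def _template():
    i = None
    def inner():
        return i
    return inner

# Code object with exactly one free variable, 'i'.
INNER_CODE = _template().__code__

def emulated_new_cell_semantics():
    funcs = []
    for value in range(3):
        cell = types.CellType(value)           # fresh cell on each iteration
        # Build a function whose closure is the new cell, mimicking what an
        # inner def would capture if the loop replaced the cell itself.
        funcs.append(types.FunctionType(INNER_CODE, globals(), "inner", None, (cell,)))
    return [f() for f in funcs]

print(current_semantics())            # [2, 2, 2]
print(emulated_new_cell_semantics())  # [0, 1, 2]
```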

However, I'm not sure how we could handle the following pathological case:

def outer():
    i = 0
    def loop():
        nonlocal i
        for i in range(10):
            def inner():
                return i
            yield inner
    def shared():
        return i
    return loop, shared


>>> loop, shared = outer()
>>> [x() for x in [x for x in loop()]]
[9, 9, 9, 9, 9, 9, 9, 9, 9, 9]
>>> shared()
9

If the for loop in loop() were modified to replace the cell (rather
than its contents) in the STORE_DEREF case, then that final call would
return 0 rather than 9, since shared() would still be holding the
original cell.
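
The two outcomes can be compared with a toy model in which cells are explicit objects. This is a sketch under stated assumptions, not interpreter code: Cell, make_reader and simulate are hypothetical names, and the slot variable stands in for the place where loop()'s frame keeps its reference to the cell for i.

```python
class Cell:
    """Toy stand-in for an interpreter cell object."""
    def __init__(self, contents):
        self.contents = contents

def make_reader(cell):
    # Models a nested function that captured `cell` when it was defined.
    return lambda: cell.contents

def simulate(replace_cell):
    slot = Cell(0)                  # cell for i created in outer(): i = 0
    shared = make_reader(slot)      # shared() captured outer()'s cell
    inners = []
    for value in range(10):
        if replace_cell:
            slot = Cell(value)      # proposed: point the slot at a new cell
        else:
            slot.contents = value   # current: overwrite the existing cell's contents
        inners.append(make_reader(slot))
    return [f() for f in inners], shared()

print(simulate(replace_cell=False))  # ([9, 9, 9, 9, 9, 9, 9, 9, 9, 9], 9)
print(simulate(replace_cell=True))   # ([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], 0)
```

In the second run the reader created before the loop never sees any of the loop's values, which is the conflict with nonlocal.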

If we did this, I think we'd have to make reusing a nonlocal reference
as a loop variable a SyntaxError since the two would flatly contradict
each other (one says "share via this existing cell" the other says
"create a new cell on each iteration").

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


