[Python-ideas] A comprehension scope issue in PEP 572

Chris Angelico rosuav at gmail.com
Sun May 6 22:02:56 EDT 2018


On Mon, May 7, 2018 at 11:32 AM, Tim Peters <tim.peters at gmail.com> wrote:
> In a different thread I noted that I sometimes want to write code like this:
>
>     while any(n % p == 0 for p in small_primes):
>         # divide p out - but what's p?
>
> But generator expressions hide the value of `p` that succeeded, so I
> can't.  `any()` and `all()` can't address this themselves - they
> merely work with an iterable of objects to evaluate for truthiness,
> and know nothing about how they're computed.  If you want to identify
> a witness (for `any()` succeeding) or a counterexample (for `all()`
> failing), you need to write a workalike loop by hand.
>
> So would this spelling work using binding expressions?
>
>     while any(n % (thisp := p) == 0 for p in small_primes):
>         n //= thisp
>
> So, in my example above, I expect that `thisp` is viewed as being
> local to the created-by-magic lexically nested function implementing
> the generator expression.  `thisp` would be bound on each iteration,
> but would vanish when `any()` finished and the anonymous function
> vanished with it.  I'd get a NameError on "n //= thisp" (or pick up
> whatever object it was bound to before the loop).

You're correct. The genexp is approximately equivalent to:

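def genexp():
    for p in small_primes:
        thisp = p  # local to genexp(), rebound on every iteration
        yield n % thisp == 0
while any(genexp()):
    n //= thisp  # NameError here, unless thisp was already bound outside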
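
For comparison, here is a rough sketch of how the witness can be
recovered without binding expressions, by asking next() for the first
matching prime instead of asking any() for a boolean. The helper names
and the contents of small_primes are made up for illustration, and the
loop assumes n is nonzero.

```python
# Illustrative values only; the original post does not define small_primes.
small_primes = [2, 3, 5, 7]


def first_small_factor(n):
    """Return the first small prime dividing n, or None if none does."""
    # next() pulls the first item from the generator expression and
    # falls back to None when the generator is exhausted.
    return next((p for p in small_primes if n % p == 0), None)


def strip_small_primes(n):
    """Divide out small prime factors of n until none remain."""
    # Assumes n != 0: zero is divisible by every prime, so the loop
    # would never terminate.
    p = first_small_factor(n)
    while p is not None:
        n //= p
        p = first_small_factor(n)
    return n
```

This keeps the witness inside an ordinary function scope, at the cost
of not being able to reuse any() directly.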

With generator expressions, since they won't necessarily be iterated
over immediately, I think it's correct to create an actual nested
function; you need the effects of closures. With comprehensions, it's
less obvious, and what you're asking for might be more plausible. The
question is, how important is the parallel between list(x for x in
iter) and [x for x in iter]? Guido likes it, and to create that
parallel, list comps MUST be in their own functions too.
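
A small sketch of both points, using made-up names (make_lazy_multiples,
factor, values): the genexp body runs only when something iterates over
it, so it has to capture 'factor' the way a closure would, and in
Python 3 the iteration variable leaks from neither form.

```python
def make_lazy_multiples(factor, values):
    # Nothing is multiplied here. The genexp body captures `factor`
    # and runs only when the returned generator is iterated.
    return (factor * v for v in values)


lazy = make_lazy_multiples(3, [1, 2, 3])
# Iteration happens here, after make_lazy_multiples() has returned.
multiples = list(lazy)

# The parallel between the two spellings: same result, and the
# iteration variable does not overwrite the outer `x` in either case.
x = "outer"
from_genexp = list(x for x in range(3))
from_listcomp = [x for x in range(3)]
```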

> I have a long history of arguing that magically created lexically
> nested anonymous functions try too hard to behave exactly like
> explicitly typed lexically nested functions, but that's the trendy
> thing to do so I always lose ;-)  The problem:  in a magically created
> nested function, you have no possibility to say _anything_ about
> scope; at least when you type it by hand, you can add `global` and/or
> `nonlocal` declarations to more-or-less say what you want.

That's a fair point. But there is another equally valid use-case for
assignment expressions inside list comps:

values = [y + 2 for x in iter if (y := f(x)) > 0]

In this case, it's just as obvious that the name 'y' should be local
to the comprehension, as 'x' is. Since there's no way to declare
"nonlocal y" inside the comprehension, you're left with a small
handful of options:

1) All names inside list comprehensions are common with their
surrounding scope. The comprehension isn't inside a function, the
iteration variable leaks, you can share names easily. Or if it *is*
inside a function, all its names are implicitly "nonlocal" (in which
case there's not much point having the function).

2) All names are local to their own scope. No names leak, and that
includes names made with ":=".

3) Some sort of rule like "iteration variables don't leak, but those
used with := are implicitly nonlocal". This would create odd edge
cases, e.g. [x for x in iter if x := x], which would probably result
in x leaking.

4) A special adornment on local names if you don't want them to leak

5) A special adornment on local names if you DO want them to leak

6) A combination of #3 and #4: "x := expr" will be nonlocal, ".x :=
expr" will be local, "for x in iter" will be local. Backward
compatible but a pain to explain.

I can't say I'm a fan of any of the complicated ones (3 through 6).
Option #2 is the current status - the name binding is part of the
expression, the expression is inside an implicit function, so the name
is bound within the function. Option 1 is plausible, but would be a
backward compatibility break, with all the consequences thereof. It'd
also be hard to implement cleanly with genexps, since they MUST be
functions. (Unless they're an entirely new concept of callable block
that doesn't include its own scope, which could work, but would be a
boatload of new functionality.)
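
To make the contrast concrete, here is a rough sketch that spells the
implicit function out by hand, once with the name local to it (option
2) and once with an explicit "nonlocal" standing in for the implicitly
nonlocal behaviour. The function names and the contents of
small_primes are invented for illustration; this only models the
scoping, not any actual implementation.

```python
# Illustrative values only.
small_primes = [2, 3, 5, 7]


def witness_stays_local(n):
    """Option 2: the name bound inside the helper does not leak."""
    thisp = None

    def genexp():
        for p in small_primes:
            thisp = p  # a new local of genexp(); the outer thisp is untouched
            yield n % thisp == 0

    any(genexp())
    return thisp  # still None, whatever any() found


def divide_out_shared(n):
    """Name shared with the enclosing scope, written with `nonlocal`."""
    thisp = None

    def genexp():
        nonlocal thisp  # rebinding now affects the enclosing function
        for p in small_primes:
            thisp = p
            yield n % thisp == 0

    # Assumes n != 0, otherwise the loop never ends.
    while any(genexp()):
        n //= thisp  # thisp is the prime that made any() stop
    return n
```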

> Since there's no way to explicitly identify the desired scope, I
> suggest that ":=" inside magically created nested functions do the
> more-useful-more-often thing:  treat the name being bound as if the
> binding had been spelled in its enclosing context instead.  So, in the
> above, if `thisp` was declared `global`, also `global` in the genexp;
> if `nonlocal`, also `nonlocal`; else (almost always the case in real
> life) local to the containing code (meaning it would be local to the
> containing code, but nonlocal in the generated function).

Is it really more useful more often?

> No, I didn't have much use for `for` target names becoming magically
> local to invisible nested functions either, but I appreciate that it's
> less surprising overall.  Using ":=" is much more strongly screaming
> "I'm going way out of my way to give a name to this thing, so please
> don't fight me by assuming I need to be protected from the
> consequences of what I explicitly asked for".

Personally, I'd still like to go back to := creating a statement-local
name, one that won't leak out of ANY statement. But the tide was
against that one, so I gave up on it.

ChrisA

