[Python-ideas] A comprehension scope issue in PEP 572

Tim Peters tim.peters at gmail.com
Thu May 10 23:47:16 EDT 2018


>> You should really read Tim's initial post in this thread, where he
>> explains his motivation.

> I did, and then I talked him out of it by pointing out how confusing it
> would be to have the binding semantics of "x := y" be context dependent.

Ya, that was an effective Jedi mind trick when I was overdue to go to sleep ;-)

To a plain user, there's nothing about a listcomp or genexp that says
"new function introduced here".  It looks like, for all the world,
that it's running _in_ the block that contains it.  It's magical
enough that `for` targets magically become local.  But that's almost
never harmful magic, and often helpful, so worth it.
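
A minimal sketch of that magic as it behaves in Python 3 (the names
`outer_block`, `x` and `scale` are made up for illustration):  the `for`
target stays private to the listcomp, while every other name is looked
up as if the listcomp were plain code in the block.

```python
def outer_block():
    x = "untouched"   # same name as the listcomp's `for` target below
    scale = 10        # an ordinary name, read from inside the listcomp

    scaled = [x * scale for x in range(4)]

    # The `for` target did not leak out: `x` still has its old value.
    return scaled, x
```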

> It *is* different, because ":=" normally binds the same as any other name
> binding operation including "for x in y:" (i.e. it creates a local
> variable), while at comprehension scope, the proposal has now become for "x
> := y" to create a local variable in the containing scope, while "for x in y"
> doesn't.

":=" target names in a genexp/listcomp are treated exactly the same as
any other non-for-target name:  they resolve to the same scope as they
resolve to in the block that contains them.  The only twist is that if
such a name `x` isn't otherwise known in the block, then `x` is
established as being local to the block (which incidentally also
covers the case when the genexp/listcomp is at module level, where
"local to the block" and "global to the block" mean the same thing).
Class scope may be an exception (I cheerfully never learned anything
about how class scope works, because I don't write insane code ;-) ).

> Comprehension scoping is already hard to explain when it's just a
> regular nested function that accepts a single argument, so I'm not looking
> forward to having to explain that "x := y" implies "nonlocal x" at
> comprehension scope

It doesn't, necessarily.  If `x` is already known as `global` in the
block, then there's an implied `global x` at comprehension scope.
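
A sketch of that case, assuming Python 3.8 or later (where binding
expressions exist and a `global` declaration in the containing block
is honored).  `running_totals`, `total` and `read_total` are
hypothetical names.

```python
def running_totals(values):
    global total          # `total` is known as global in this block
    total = 0
    # The ":=" inside the listcomp rebinds the *global* `total`,
    # not a name private to the listcomp.
    return [(total := total + v) for v in values]


def read_total():
    # Reads the module-level `total` that running_totals() rebound.
    return total
```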

> (except that unlike a regular nonlocal declaration, it also implicitly makes it a local
> in the immediately surrounding scope).

Only if `x` is otherwise _unknown_ in the block.  If, e.g., `x` is
already known in an enclosing scope E, then `x` also resolves to scope
E in the comprehension.  It is not made local to the enclosing scope
in that case.
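
A sketch for Python 3.8+ with hypothetical names (`scope_e`, `block`).
Note the assumption:  Python as released only gives this result when
the containing block spells out `nonlocal x`, whereas under the rule
described here the block already knowing `x` from E would be enough.

```python
def scope_e():
    x = "bound in E"

    def block():
        nonlocal x  # assumption: Python 3.8+ needs this spelled out
        # The ":=" target resolves to E's `x`; nothing local to
        # `block` is created.
        return [(x := v) for v in ("a", "b", "c")]

    items = block()
    return items, x
```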

I think it's more fruitful to explain the semantics than try to
explain a concrete implementation.  Python has a "lumpy" scope
system now, with hard breaks among global scopes, class scopes, and
all other lexical scopes.  That makes implementations artificially
noisy to specify.  "resolve to the same scope as they resolve to in
the block that contains them, with a twist ..." avoids that noise
(e.g., the words "global" and "nonlocal" don't even occur), and gets
directly to the point:  in which scope does a name live?  If you think
it's already clear enough which scope `y` resolves to in

    z = (x+y for x in range(10))

then it's exactly as clear which scope `y` resolves to in

    z = (x + (y := 7) for x in range(10))

with the twist that if `y` is otherwise unknown in the containing
block, `y` becomes local to the block.
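
To make the twist concrete, here is a sketch that runs on Python 3.8+,
whose binding expressions behave this way inside a function
(`containing_block` is a made-up name).  Note that the genexp has to
actually run before `y` is bound.

```python
def containing_block():
    # `y` appears nowhere else in this function, so the ":=" below
    # makes it local to containing_block(), not to the genexp.
    z = (x + (y := 7) for x in range(10))
    total = sum(z)      # consuming the genexp performs the binding
    return total, y
```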

> It isn't reasonable to wave this away as "It's only confusing to Nick
> because he's intimately familiar with how comprehensions are implemented",

As above, though, I'm gently suggesting that being so intimately
familiar with implementation details may be interfering with seeing
how all those details can _obscure_ rather than illuminate.  Whenever
you think you need to distinguish between, e.g., "nonlocal" and
"global", you're too deep in the detail weeds.

> as I also wrote some of the language reference docs for the current (already
> complicated) comprehension scoping semantics, and I can't figure out how
> we're going to document the proposed semantics in a way that will actually
> be reasonably easy for readers to follow.

Where are those docs?  I expect to find such stuff in section 4
("Execution model") of the Language Reference Manual, but listcomps
and genexps are only mentioned in passing once in the 3.6.5 section 4
docs, just noting that they don't always play well at class scope.

> ...
> - for comprehensions at class scope, the class scope is ignored for purposes
> of determining the target binding scope (and hence will implicitly create a
> new global variable when used in a top level class definition, and new
> function local when used in a class definition nested inside a function)

Isn't all of that covered too by "resolve to the same scope as they
resolve to in the block that contains them ..."?  For example, in

    class K:

at module level, `g` obviously refers to the global `g`.  Therefore
any `g` appearing as a ":=" target in an immediately contained
comprehension also refers to the global `g`, exactly the same as if
`g` were any other non-for-target name in the comprehension.  That's
not a new rule:  it's a consequence of how class scopes already work.
Which remain inscrutable to me ;-)
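
The existing class-scope behaviour this leans on can be seen without
":=" at all.  A sketch with hypothetical names (`build_class`, `K`,
`seen`), nesting the class in a function so the example doesn't depend
on module globals:  a name used in the body of a comprehension
directly inside a class skips the class's own binding.  At module
level the same lookup would reach the global `g` instead.

```python
def build_class():
    g = "function g"

    class K:
        g = "class g"
        # The listcomp body cannot see the class-level `g`; it
        # resolves `g` in the enclosing function instead.
        seen = [g for _ in range(2)]

    return K
```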

> ...;
> P.S. None of the above concerns apply to explicit inline scope declarations,
> as those are easy to explain by saying that the inline declarations work the
> same way as the scope declaration statements do, and can be applied
> universally to all name binding operations rather than being specific to ":=
> in comprehension scope".

You already know I'd be happy with being explicit too, but Guido
didn't like it.  Perhaps he'd like it better if it were even _more_
like regular declarations.  Off the top of my head, say that a
comprehension could start with a new optional declaration section,

    def f():
        g = 12
        i = 8
        genexp = (<global g; nonlocal i> g + (j := i*2) for i in range(2))

Of course that's contrived.  When the genexp ran, the `g` would refer
to the global `g` (and the f-local `g` would be ignored); the
local-to-f `i` would end up bound to 1, and in this "all bindings are
local by default" world the ":=" binding to `j` would simply vanish
when the genexp ended.

In practice, I'd be amazed to see anything much fancier than

    p = None  # annoying but worth it ;-)  that is, in this world the
              # intended scope for a nonlocal needs to be explicitly
              # established
    while any((<nonlocal p> n % p == 0 for p in small_primes)):
        n //= p

Note too:  a binding expression (":=") isn't even needed then for this
class of use case.
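
For comparison, the same loop can be sketched in runnable form on
Python 3.8+, which has no `<nonlocal p>` declaration section, so there
the witness has to be captured with a binding expression instead.  The
function name `strip_small_factors` and the extra loop variable `q`
are assumptions of this sketch (a ":=" may not rebind the `for` target
itself).

```python
def strip_small_factors(n, small_primes=(2, 3, 5, 7)):
    # Expects n >= 1. any() stops at the first true element, so `p`
    # is left bound to the first prime in small_primes that divides n.
    while any(n % (p := q) == 0 for q in small_primes):
        n //= p
    return n
```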

OTOH, it's inexplicable _unless_ someone learns something about how a
synthetic function is being created to implement the genexp.
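
Roughly what such a synthetic function looks like, as a sketch rather
than CPython's actual generated code (the names `_genexp`,
`via_genexp` and `via_synthetic_function` are made up):  the outermost
iterable is evaluated in the containing block and passed in as the
single argument, and the `for` target lives only inside the nested
function.

```python
def via_genexp(n, small_primes):
    return (n % p == 0 for p in small_primes)


def via_synthetic_function(n, small_primes):
    def _genexp(iterator):
        # `p` is local to this hidden function, which is why it is
        # invisible to the containing block.
        for p in iterator:
            yield n % p == 0

    # The outermost iterable is evaluated here, in the containing block.
    return _genexp(iter(small_primes))
```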
