[Python-ideas] A comprehension scope issue in PEP 572

Nick Coghlan ncoghlan at gmail.com
Fri May 11 07:15:19 EDT 2018

On 10 May 2018 at 23:47, Tim Peters <tim.peters at gmail.com> wrote:

> ...
> [Guido]
> >> You should really read Tim's initial post in this thread, where he
> >> explains his motivation.
> [Nick]
> > I did, and then I talked him out of it by pointing out how confusing it
> > would be to have the binding semantics of "x := y" be context dependent.
> Ya, that was an effective Jedi mind trick when I was overdue to go to
> sleep ;-)
> To a plain user, there's nothing about a listcomp or genexp that says
> "new function introduced here".  It looks like, for all the world,
> that it's running _in_ the block that contains it.  It's magical
> enough that `for` targets magically become local.  But that's almost
> never harmful magic, and often helpful, so worth it.
> >...
> > It *is* different, because ":=" normally binds the same as any other name
> > binding operation including "for x in y:" (i.e. it creates a local
> > variable), while at comprehension scope, the proposal has now become for
> > "x := y" to create a local variable in the containing scope, while
> > "for x in y" doesn't.
> ":=" target names in a genexp/listcomp are treated exactly the same as
> any other non-for-target name:  they resolve to the same scope as they
> resolve to in the block that contains them.  The only twist is that if
> such a name `x` isn't otherwise known in the block, then `x` is
> established as being local to the block (which incidentally also
> covers the case when the genexp/listcomp is at module level, where
> "local to the block" and "global to the block" mean the same thing).
> Class scope may be an exception (I cheerfully never learned anything
> about how class scope works, because I don't write insane code ;-) ).

That's all well and good, but it is *completely insufficient for the
language specification*. For the language spec, we have to be able to tell
implementation authors exactly how all of the "bizarre edge cases" that
you're attempting to hand wave away should behave by updating the language
reference appropriately. It isn't 1995 any more - while CPython is still the
reference implementation for Python, we're far from being the only
implementation, which means we have to be a lot more disciplined about how
much we leave up to the implementation to define.

The expected semantics for locals() are already sufficiently unclear that
they're a source of software bugs (even in CPython) when attempting to run
things under a debugger or line profiler (or anything else that sets a
trace function). See https://www.python.org/dev/peps/pep-0558/ for details.

"Comprehension scopes are already confusing, so it's OK to dial their
weirdness all the way up to 11" is an *incredibly* strange argument to be
attempting to make when the original, better-defined sublocal scoping
proposal was knocked back as being overly confusing (even after it had been
deliberately simplified by prohibiting nonlocal access to sublocals).

Right now, the learning process for picking up the details of comprehension
scopes goes something like this:

* make the technically-incorrect-but-mostly-reliable assumption that
"[x for x in data]" is semantically equivalent to a for loop (especially
common for experienced Py2 devs where this really was the case):

    _result = []
    for x in data:
        _result.append(x)

* discover that "[x for x in data]" is actually semantically equivalent to
"list(x for x in data)" (albeit without the name lookup and optimised to
avoid actually creating the generator-iterator)
* make the still-technically-incorrect-but-even-more-reliable assumption
that the generator expression "(x for x in data)" is equivalent to

    def _genexp():
        for x in data:
            yield x

    _result = _genexp()

* *maybe* discover that even the above expansion isn't quite accurate, and
that the underlying semantic equivalent is actually this (one way to
discover this by accident is to have a name error in the outermost iterable
expression):

    def _genexp(_outermost_iter):
        for x in _outermost_iter:
            yield x

    _result = _genexp(data)

* and then realise that the optimised list comprehension form is
essentially this:

    def _listcomp(_outermost_iter):
        result = []
        for x in _outermost_iter:
            result.append(x)
        return result

    _result = _listcomp(data)
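
A minimal sketch of how the steps above show up at runtime, assuming any
Python 3 interpreter. The function names and the deliberately undefined
names (_no_such_name, _also_missing) are purely illustrative and aren't
taken from any proposal:

```python
def outer_name_after_listcomp():
    # The iteration variable lives in the comprehension's own scope,
    # so the outer "x" is left untouched.
    x = "outer"
    squares = [x * x for x in range(3)]
    return x, squares


def outermost_iterable_error():
    # The outermost iterable is evaluated in the containing scope when the
    # generator expression is created, so the NameError shows up right away.
    try:
        gen = (x for x in _no_such_name)  # hypothetical undefined name
    except NameError:
        return "raised at creation"
    return "deferred"


def inner_iterable_error():
    # Inner iterables are evaluated inside the implicit function, so the
    # NameError only appears once iteration actually starts.
    gen = (x for _ in [0] for x in _also_missing)  # hypothetical undefined name
    try:
        next(gen)
    except NameError:
        return "raised at iteration"
    return "no error"
```

The difference between the last two functions is the accidental discovery
route mentioned in the fourth step: only the outermost iterable is passed in
as an argument, everything else runs inside the implicit function.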

Now that "yield" in comprehensions has been prohibited, you've learned all
the edge cases at that point - all of the runtime behaviour of things like
name references, locals(), lambda expressions that close over the iteration
variable, etc can be explained directly in terms of the equivalent
functions and generators, so while comprehension iteration variable hiding
may *seem* magical, it's really mostly explained by the deliberate semantic
equivalence between the comprehension form and the constructor+genexp form.
(That's exactly how PEP 3100 describes the change: "Have list
comprehensions be syntactic sugar for passing an equivalent generator
expression to list(); as a consequence the loop variable will no longer be
exposed".)
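
As a rough illustration of explaining runtime behaviour directly in terms
of the equivalent functions, here's a sketch comparing lambdas created in a
comprehension with lambdas created in the hand-written expansion. The names
(call_comprehension_lambdas, call_expanded_lambdas, _listcomp) are
illustrative only:

```python
def call_comprehension_lambdas():
    # Every lambda closes over the same iteration variable "i" in the
    # comprehension's scope, so they all see its final value.
    callbacks = [lambda: i for i in range(3)]
    return [cb() for cb in callbacks]


def call_expanded_lambdas():
    # Hand-written expansion: "i" is an ordinary local of _listcomp, and
    # the lambdas close over that one variable in exactly the same way.
    def _listcomp(_outermost_iter):
        result = []
        for i in _outermost_iter:
            result.append(lambda: i)
        return result

    callbacks = _listcomp(iter(range(3)))
    return [cb() for cb in callbacks]
```

Both functions return the same list, and the late-binding behaviour needs no
comprehension-specific rule to explain it: it's just how closures over a
function local work.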

As such, any proposal to have name bindings behave differently in
comprehension and generator expression scope from the way they would behave
in the equivalent nested function definitions *must be specified to an
equivalent level of detail as the status quo*.

All of the attempts at such a definition that have been made so far have
been riddled with action at a distance and context-dependent compilation
requirements:

* whether to implicitly declare the binding target as nonlocal or global
depends on whether or not you're at module scope or inside a function
* the desired semantics at class scope have been left largely unclear
* the desired semantics in the case of nested comprehensions and generator
expressions have been left entirely unclear
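
To make the first point concrete, here's a sketch of what the explicitly
nested expansions might need to look like for a comprehension that rebinds
a name in the containing scope. This is only my reading of the proposed
semantics, not anything the PEP currently specifies. Plain assignment stands
in for ":=", and the names (function_scope_case, module_scope_case,
module_total) are made up for the example:

```python
def function_scope_case(data):
    total = 0  # must already be bound here, or "nonlocal" won't compile

    def _listcomp(_outermost_iter):
        nonlocal total  # inside a function: target is the enclosing local
        result = []
        for x in _outermost_iter:
            total = total + x  # stands in for "total := total + x"
            result.append(total)
        return result

    return _listcomp(iter(data)), total


def module_scope_case(data):
    # Stand-in for code running at module level, where the binding target
    # is a module global rather than a function local.
    global module_total
    module_total = 0

    def _listcomp(_outermost_iter):
        global module_total  # "nonlocal" would be a SyntaxError here
        result = []
        for x in _outermost_iter:
            module_total = module_total + x
            result.append(module_total)
        return result

    return _listcomp(iter(data)), module_total
```

The two expansions differ only in the declaration, and which one the
compiler has to pick depends entirely on where the comprehension appears.
Note also that the "nonlocal" version only compiles because "total" is
already bound in the enclosing function, which is exactly the case the
"isn't otherwise known in the block" twist would have to cover.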

Now, there *are* ways to resolve these problems in a coherent way, and that
would be to define "parent local scoping" as a new scope type, and
introduce a corresponding "parentlocal NAME" compiler declaration to
explicitly request those semantics for bound names (allowing the expansions
of comprehensions and generator expressions as explicitly nested functions
to be adjusted accordingly).

But the PEP will need to state explicitly that that's what it is doing, and
fully specify how those new semantics are expected to work in *all* of the
existing scope types, not just the two where the desired behaviour is
relatively easy to define in terms of nonlocal and global.


Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia