On 10 May 2018 at 23:47, Tim Peters <tim.peters@gmail.com> wrote:

>>> You should really read Tim's initial post in this thread, where he
>>> explains his motivation.

>> I did, and then I talked him out of it by pointing out how confusing it
>> would be to have the binding semantics of "x := y" be context dependent.

> Ya, that was an effective Jedi mind trick when I was overdue to go to sleep ;-)

> To a plain user, there's nothing about a listcomp or genexp that says
> "new function introduced here".  It looks like, for all the world,
> that it's running _in_ the block that contains it.  It's magical
> enough that `for` targets magically become local.  But that's almost
> never harmful magic, and often helpful, so worth it.

>> It *is* different, because ":=" normally binds the same as any other name
>> binding operation including "for x in y:" (i.e. it creates a local
>> variable), while at comprehension scope, the proposal has now become for "x
>> := y" to create a local variable in the containing scope, while "for x in y"
>> doesn't.

> ":=" target names in a genexp/listcomp are treated exactly the same as
> any other non-for-target name:  they resolve to the same scope as they
> resolve to in the block that contains them.  The only twist is that if
> such a name `x` isn't otherwise known in the block, then `x` is
> established as being local to the block (which incidentally also
> covers the case when the genexp/listcomp is at module level, where
> "local to the block" and "global to the block" mean the same thing).
> Class scope may be an exception (I cheerfully never learned anything
> about how class scope works, because I don't write insane code ;-) ).

That's all well and good, but it is *completely insufficient for the language specification*. For the language spec, we have to be able to tell implementation authors exactly how all of the "bizarre edge cases" that you're attempting to hand-wave away should behave by updating https://docs.python.org/dev/reference/expressions.html#displays-for-lists-sets-and-dictionaries appropriately. It isn't 1995 any more - while CPython is still the reference implementation for Python, we're far from being the only implementation, which means we have to be a lot more disciplined about how much we leave up to the implementation to define.

The expected semantics for locals() are already sufficiently unclear that they're a source of software bugs (even in CPython) when attempting to run things under a debugger or line profiler (or anything else that sets a trace function). See https://www.python.org/dev/peps/pep-0558/ for details.

"Comprehension scopes are already confusing, so it's OK to dial their weirdness all the way up to 11" is an *incredibly* strange argument to be attempting to make when the original better defined sublocal scoping proposal was knocked back as being overly confusing (even after it had been deliberately simplified by prohibiting nonlocal access to sublocals).

Right now, the learning process for picking up the details of comprehension scopes goes something like this:

* make the technically-incorrect-but-mostly-reliable-in-the-absence-of-name-shadowing assumption that "[x for x in data]" is semantically equivalent to a for loop (especially common for experienced Py2 devs where this really was the case!):

    _result = []
    for x in data:
        _result.append(x)

* discover that "[x for x in data]" is actually semantically equivalent to "list(x for x in data)" (albeit without the name lookup and optimised to avoid actually creating the generator-iterator)
* make the still-technically-incorrect-but-even-more-reliable assumption that the generator expression "(x for x in data)" is equivalent to

    def _genexp():
        for x in data:
            yield x

    _result = _genexp()

* *maybe* discover that even the above expansion isn't quite accurate, and that the underlying semantic equivalent is actually this (one way to discover this by accident is to have a name error in the outermost iterable expression):

    def _genexp(_outermost_iter):
        for x in _outermost_iter:
            yield x

    _result = _genexp(data)

* and then realise that the optimised list comprehension form is essentially this:

    def _listcomp(_outermost_iter):
        result = []
        for x in _outermost_iter:
            result.append(x)
        return result

    _result = _listcomp(data)
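
As a rough illustration of the "name error in the outermost iterable" discovery path mentioned above, the following sketch contrasts a generator expression whose outermost iterable is undefined with one whose inner iterable is undefined. The function names and the names undefined_outer and undefined_inner are made up for this example; the point is only that the first error shows up when the genexp is created, and the second only once it is iterated.

```python
def make_genexp_outer_error():
    # The outermost iterable is evaluated immediately, in the enclosing
    # scope, so the NameError is raised here, before any iteration happens.
    return (x for x in undefined_outer)


def make_genexp_inner_error():
    # Inner iterables are evaluated inside the implicit generator function,
    # so the NameError is deferred until the generator is actually iterated.
    return (y for x in range(1) for y in undefined_inner)


try:
    make_genexp_outer_error()
    outer_raised_at_creation = False
except NameError:
    outer_raised_at_creation = True

gen = make_genexp_inner_error()  # creating the generator raises nothing
try:
    next(gen)
    inner_raised_on_iteration = False
except NameError:
    inner_raised_on_iteration = True

print(outer_raised_at_creation, inner_raised_on_iteration)
```

That asymmetry is what the "_outermost_iter" parameter in the expansions is modelling: the argument is evaluated by the caller, everything else by the nested function.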

Now that "yield" in comprehensions has been prohibited, you've learned all the edge cases at that point - all of the runtime behaviour of things like name references, locals(), lambda expressions that close over the iteration variable, etc can be explained directly in terms of the equivalent functions and generators, so while comprehension iteration variable hiding may *seem* magical, it's really mostly explained by the deliberate semantic equivalence between the comprehension form and the constructor+genexp form. (That's exactly how PEP 3100 describes the change: "Have list comprehensions be syntactic sugar for passing an equivalent generator expression to list(); as a consequence the loop variable will no longer be exposed")
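
To make that concrete, here is a small sketch (the variable names are purely illustrative) showing two of those behaviours: the iteration variable not leaking into the containing scope, and lambdas that close over the iteration variable all seeing its final value. The hand-written _listcomp below is the explicit expansion from earlier in this message, and it gives the same result as the comprehension.

```python
x = "outer"
squares = [x * x for x in range(3)]
# The comprehension's "x" is hidden: the outer binding is untouched.
print(x, squares)

# Each lambda closes over the same iteration variable, so after the loop
# finishes they all report its last value.
callbacks = [lambda: i for i in range(3)]
late_bound = [cb() for cb in callbacks]


def _listcomp(_outermost_iter):
    # Explicit nested-function expansion of the comprehension above.
    result = []
    for i in _outermost_iter:
        result.append(lambda: i)
    return result


expanded = [cb() for cb in _listcomp(range(3))]
print(late_bound, expanded)
```

Nothing here needs a special rule for comprehensions: ordinary function scoping and closure semantics account for both results.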

As such, any proposal to have name bindings behave differently in comprehension and generator expression scope from the way they would behave in the equivalent nested function definitions *must be specified to an equivalent level of detail as the status quo*.

All of the attempts at such a definition that have been made so far have been riddled with action at a distance and context-dependent compilation requirements:

* whether to implicitly declare the binding target as nonlocal or global depends on whether or not you're at module scope or inside a function
* the desired semantics at class scope have been left largely unclear
* the desired semantics in the case of nested comprehensions and generator expressions have been left entirely unclear
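
The first point can be sketched with today's syntax by writing the expansions out by hand. This is only an approximation of the proposed behaviour, not a specification of it: the name "total" and the running-sum loop are invented for the example, and the module-level case is run through exec() in its own namespace so that it really is module scope. The same nested function needs "nonlocal" in one context and "global" in the other, and the "nonlocal" form is rejected outright at module level.

```python
def function_scope_case(data):
    total = 0

    def _listcomp(_outermost_iter):
        # Works only because an enclosing *function* binds "total".
        nonlocal total
        result = []
        for x in _outermost_iter:
            total = total + x
            result.append(total)
        return result

    return _listcomp(data), total


# At module scope the equivalent expansion has to say "global" instead.
module_source = """
def _listcomp(_outermost_iter):
    global total
    result = []
    for x in _outermost_iter:
        total = total + x
        result.append(total)
    return result

total = 0
_result = _listcomp([1, 2, 3])
"""
module_ns = {}
exec(module_source, module_ns)

# The function-scope spelling does not even compile at module scope.
bad_source = """
def _listcomp(_outermost_iter):
    nonlocal total
    return [total for x in _outermost_iter]
"""
try:
    compile(bad_source, "<sketch>", "exec")
    nonlocal_rejected_at_module_level = False
except SyntaxError:
    nonlocal_rejected_at_module_level = True

print(function_scope_case([1, 2, 3]), module_ns["_result"], module_ns["total"])
print(nonlocal_rejected_at_module_level)
```

Class scope and nested comprehensions have no equally direct spelling with "nonlocal" or "global", which is why they are the cases that need to be specified explicitly.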

Now, there *are* ways to resolve these problems in a coherent way, and that would be to define "parent local scoping" as a new scope type, and introduce a corresponding "parentlocal NAME" compiler declaration to explicitly request those semantics for bound names (allowing the expansions of comprehensions and generator expressions as explicitly nested functions to be adjusted accordingly).

But the PEP will need to state explicitly that that's what it is doing, and fully specify how those new semantics are expected to work in *all* of the existing scope types, not just the two where the desired behaviour is relatively easy to define in terms of nonlocal and global.


Nick Coghlan   |   ncoghlan@gmail.com   |   Brisbane, Australia