[Python-Dev] Informal educator feedback on PEP 572 (was Re: 2018 Python Language Summit coverage, last part)

Nick Coghlan ncoghlan at gmail.com
Wed Jun 27 08:27:30 EDT 2018


On 26 June 2018 at 02:27, Guido van Rossum <guido at python.org> wrote:
> [This is my one reply in this thread today. I am trying to limit the amount
> of time I spend to avoid another overheated escalation.]

Aye, I'm trying to do the same, and deliberately spending some
evenings entirely offline is helping with that :)

> On Mon, Jun 25, 2018 at 4:44 AM Nick Coghlan <ncoghlan at gmail.com> wrote:
>>
>> Right, the proposed blunt solution to "Should I use 'NAME = EXPR' or
>> 'NAME := EXPR'?" bothers me a bit, but it's the implementation
>> implications of parent local scoping that I fear will create a
>> semantic tar pit we can't get out of later.
>
> Others have remarked this too, but it really bothers me that you are focusing
> so much on the implementation of parent local scoping rather than on the
> "intuitive" behavior which is super easy to explain -- especially to someone
> who isn't all that familiar (or interested) with the implicit scope created
> for the loop control variable(s). According to Steven (who noticed that this
> is barely mentioned in most tutorials about comprehensions) that is most
> people, however very few of them read python-dev.
>
> It's not that much work for the compiler, since it just needs to do a little
> bit of (new) static analysis and then it can generate the bytecode to
> manipulate closure(s). The runtime proper doesn't need any new
> implementation effort. The fact that sometimes a closure must be introduced
> where no explicit initialization exists is irrelevant to the runtime -- this
> only affects the static analysis, at runtime it's no different than if the
> explicit initialization was inside `if 0`.

One of the things I prize about Python's current code generator is how
many of the constructs can be formulated as simple content-and-context
independent boilerplate removal, which is why parent local scoping (as
currently defined in PEP 572) bothers me: rather than being a new
primitive in its own right, the PEP instead makes the notion of "an
assignment expression in a comprehension or generator expression" a
construct that can't readily be decomposed into lower level building
blocks the way that both assignment expressions on their own and
comprehensions and generator expressions on their own can be. Instead,
completely new language semantics arise from the interaction between
two otherwise independent features.

Even changes as complicated as PEP 343's with statement, PEP 380's
yield from, and PEP 492's native coroutines all include examples of
how they could be written *without* the benefit of the new syntax.

By contrast, PEP 572's parent local scoping can't currently be defined
that way. Instead, to explain how the code generator is going to be
expected to handle comprehensions, you have to take the current
comprehension semantics and add two new loops to link up the bound
names correctly::

    [item := x for x in items]

becomes:

    # Each bound name gets declared as local in the parent scope
    if 0:
        for item in (): pass
    def _list_comp(_outermost_iter):
        # Each bound name gets declared as:
        #   - nonlocal item if outer scope is a function scope
        #   - global item if outer scope is a module scope
        #   - an error, otherwise
        _result = []
        for x in _outermost_iter:
            item = x
            _result.append(x)
        return _result

    _expr_result = _list_comp(items)
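
For anyone who wants to run the function-scope case of this expansion in current Python, here is a rough sketch. The wrapper name `parent` is hypothetical, and `nonlocal item` is written out by hand where the compiler would be expected to add it implicitly under the proposal:

```python
def parent(items):
    # Dead code, but the compiler still records "item" as a local of parent()
    if 0:
        for item in (): pass

    def _list_comp(_outermost_iter):
        nonlocal item  # parent() is a function scope, so nonlocal applies
        _result = []
        for x in _outermost_iter:
            item = x
            _result.append(x)
        return _result

    _expr_result = _list_comp(iter(items))
    return _expr_result, item


print(parent([1, 2, 3]))  # ([1, 2, 3], 3)
```

Note that `item` is declared in `parent()` without ever being initialised there, so calling `parent([])` fails when `item` is read, just as it would for any other unbound local.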

This is why my objections would be reduced significantly if the PEP
explicitly admitted that it was defining a new kind of scoping
semantics, and actually made those semantics available as an explicit
"parentlocal NAME" declaration (behind a "from __future__ import
parent_locals" guard), such that the translation of the above example
to an explicitly nested scope could just be the visually
straightforward::

    def _list_comp(_outermost_iter):
        parentlocal item
        _result = []
        for x in _outermost_iter:
            item = x
            _result.append(x)
        return _result

    _expr_result = _list_comp(items)

That splits up the learning process for anyone trying to really
understand how this particular aspect of Python's code generation
works into two distinct pieces:

- "assignment expressions inside comprehensions and generator
expressions use parent local scoping"
- "parent local scoping works <the way that PEP 572 defines it>"

If the PEP did that, we could likely even make parent locals work
sensibly for classes by saying that "parent local" for a method
definition in a class body refers to the closure namespace where we
already stash __class__ references for the benefit of zero-arg super
(this would also be a far more robust way of defining private class
variables than name mangling is able to offer).
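
The existing `__class__` closure cell mentioned above can be inspected directly. A small sketch (the class names are illustrative) showing where zero-arg super finds its class reference:

```python
class Base:
    def greet(self):
        return "base"


class Child(Base):
    def greet(self):
        # Zero-arg super() makes the compiler add an implicit __class__ cell
        return "child+" + super().greet()


print(Child.greet.__code__.co_freevars)                   # ('__class__',)
print(Child.greet.__closure__[0].cell_contents is Child)  # True
print(Child().greet())                                    # child+base
```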

Having parent locals available as a language level concept (rather
than solely as an interaction between assignment expressions and
implicitly nested scopes) also gets us to a point where
context-independent code thunks that work both at module level and
inside another function can be built as nested functions which declare
all their working variables as parentlocal (you still need to define
the thunks inline in the scope you want them to affect, since this
isn't dynamic scoping, but when describing the code, you don't need to
say "as a module level function define it this way, as a nested
function define it that way").
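
The "define it this way at module level, that way in a function" split can be seen in current Python. In this sketch (all names are illustrative), the same working-variable update needs `global` in one place and `nonlocal` in the other, and the `nonlocal` spelling is rejected at compile time when the enclosing scope is the module:

```python
total = 0


def add_at_module_level(n):
    global total  # the enclosing scope is the module
    total += n


def run_in_function():
    total = 0

    def add_nested(n):
        nonlocal total  # the enclosing scope is a function
        total += n

    add_nested(2)
    add_nested(3)
    return total


def nonlocal_at_module_level_is_rejected():
    # A module-level function cannot declare a module variable as nonlocal
    source = "def thunk():\n    nonlocal total\n    total = 1\n"
    try:
        compile(source, "<thunk>", "exec")
    except SyntaxError:
        return True
    return False


add_at_module_level(2)
add_at_module_level(3)
```

A single "parentlocal" declaration, if it existed, would presumably cover both `add_at_module_level` and `add_nested` with one spelling.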

An explicit "parentlocal NAME" concept at the PEP 572 layer would also
change the nature of the draft "given" proposal from competing with
PEP 572, to instead being a follow-up proposal that focused on
providing control of target name declarations in lambda expressions,
comprehensions, and generator expressions such that:

- (lambda arg: value := arg given parentlocal value) # Exports "value"
  to parent scope
- any(x for x in items given parentlocal x) # Exports "x" to parent scope
- [y for x in data if (y := f(x)) given y] # *Avoids* exporting "y" to
  parent scope

With parent local scoping in the mix the proposed "given" syntax could
also dispense with initialiser and type hinting support entirely and
instead only allow:

- "... given NAME" (always local, no matter the default scoping)
- "... given parentlocal NAME" (always parent local, declaring if necessary)
- "... given nonlocal NAME" (always nonlocal, error if not declared in
outer scope)
- "... given global NAME" (always global, no matter how nested the
current scope is)
- "... given (TARGET1, TARGET2, ...)" (declaring multiple assignment targets)

If you want an initialiser or a type hint, then you'd use parentlocal
semantics. If you want to keep names local (e.g. to avoid exporting
them as part of a module's public API) then you can do that, too.
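
Keeping a helper name out of the parent scope, which is the goal of the "given y" spelling, can already be approximated in current Python with an extra single-item `for` clause. A sketch, with `f` and `data` as placeholder names:

```python
def f(x):
    return x - 2  # placeholder computation


def filtered(data):
    y = "untouched"
    # "for y in [f(x)]" binds y inside the comprehension's own scope
    result = [y for x in data for y in [f(x)] if y]
    return result, y


print(filtered([1, 2, 3, 4]))  # ([-1, 1, 2], 'untouched')
```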

>> Unfortunately, I think the key rationale for (b) is that if you
>> *don't* do something along those lines, then there's a different
>> strange scoping discrepancy that arises between the non-comprehension
>> forms of container displays and the comprehension forms:
>>
>>     (NAME := EXPR,) # Binds a local
>>     tuple(NAME := EXPR for __ in range(1)) # Doesn't bind a local
>> [...]
>> Those scoping inconsistencies aren't *new*, but provoking them
>> currently involves either class scopes, or messing about with
>> locals().
>
> In what sense are they not new? This syntax doesn't exist yet.

The simplest way to illustrate the scope distinction today is with
"len(locals())":

    >>> [len(locals()) for i in range(1)]
    [2]
    >>> [len(locals())]
    [7]

But essentially nobody ever does that, so the distinction doesn't
currently matter.
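
The class-scope route to the same discrepancy can be sketched as follows (illustrative names; each class is built inside a function only so that the failure can be caught and reported):

```python
def display_in_class_body():
    class Display:
        n = 3
        values = [n]  # evaluated directly in the class scope
    return Display.values


def comprehension_in_class_body():
    try:
        class Comprehension:
            n = 3
            # The element expression runs in the implicit nested scope,
            # which cannot see names bound in the class body
            values = [n for _ in range(1)]
    except NameError:
        return "NameError"
    return Comprehension.values


print(display_in_class_body())        # [3]
print(comprehension_in_class_body())  # NameError
```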

By contrast, where assignment expressions bind their targets matters a
*lot*, so PEP 572 makes the existing scoping oddities a lot more
significant.

> You left out another discrepancy, which is more likely to hit people in the
> face: according to your doctrine, := used in the "outermost iterable" would
> create a local in the containing scope, since that's where the outermost
> iterable is evaluated. So in this example
>
>     a = [x := i+1 for i in range(y := 2)]
>
> the scope of x would be the implicit function (i.e. it wouldn't leak) while
> the scope of y would be the same as that of a. (And there's an even more
> cryptic example, where the same name is assigned in both places.)

Yeah, the fact it deals with this problem nicely is one aspect of the
parent local scoping that I find genuinely attractive.
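
The underlying split, where the outermost iterable is evaluated in the containing scope and everything else in the implicit one, is observable today with generator expressions. A sketch (`missing_name` is deliberately left undefined):

```python
def outermost_iterable():
    try:
        # The outermost iterable is evaluated right here, in this scope
        gen = (x for x in missing_name)
    except NameError:
        return "raised at creation"
    return "deferred"


def inner_iterable():
    # Inner iterables are evaluated lazily, inside the implicit scope
    gen = (y for x in [1] for y in missing_name)
    try:
        next(gen)
    except NameError:
        return "raised on first next()"
    return "no error"


print(outermost_iterable())  # raised at creation
print(inner_iterable())      # raised on first next()
```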

>> Parent local scoping tries to mitigate the surface inconsistency by
>> changing how write semantics are defined for implicitly nested scopes,
>> but that comes at the cost of making those semantics inconsistent with
>> explicitly nested scopes and with the read semantics of implicitly
>> nested scopes.
>
>
> Nobody thinks about write semantics though -- it's simply not the right
> abstraction to use here, you've introduced it because that's how *you* think
> about this.

The truth of the last part of that paragraph means that the only way
for the first part of it to be true is to decide that my way of
thinking is *so* unusual that nobody else in the 10 years that Python
3 has worked the way it does now has used the language reference, the
source code, the disassembler, or the debugger to formulate a similar
mental model of how they expect comprehensions and generator
expressions to behave.

I'll grant that I may be unusual in thinking about comprehensions and
generator expressions the way I do, and I definitely accept that most
folks simply don't think about the subtleties of how they handle
scopes in the first place, but I *don't* accept the assertion that I'm
unique in thinking about them that way. There are simply too many edge
cases in their current runtime behaviour where the "Aha!" moment at
the end of a debugging effort is going to be the realisation that
they're implemented as an implicitly nested scope, and we've had a
decade of Python 3 use where folks prone towards writing overly clever
comprehensions have been in a position to independently make that
discovery.

>> The early iterations of PEP 572 tried to duck this whole realm of
>> potential semantic inconsistencies by introducing sublocal scoping

> There was also another variant in some iteration of PEP 572, after sublocal
> scopes were already eliminated -- a change to comprehensions that would
> evaluate the outermost iterable in the implicit function. This would make
> the explanation of inline assignment in comprehensions consistent again
> (they were always local to the comprehension in that iteration of the PEP),
> at the cost of a backward incompatibility that was ultimately withdrawn.

Yeah, the current "given" draft has an open question around the idea
of having the presence of a "given" clause pull the outermost iterable
evaluation inside the nested scope. It still doesn't really solve the
problem, though, so I think I'd actually consider
PEP-572-with-explicit-parent-local-scoping-support the version of
assignment expressions that most cleanly handles the interaction with
comprehension scopes without making that interaction rely on opaque
magic (instead, it would be relying on an implicit target scope
declaration, the same as any other name binding - the only unusual
aspect is that the implicit declaration would be "parentlocal NAME"
rather than the more typical local variable declaration).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

