[Python-ideas] PEP 572: Statement-Local Name Bindings, take three!

Sun Mar 25 03:00:37 EDT 2018

On 25 March 2018 at 15:34, Guido van Rossum <guido at python.org> wrote:
>
> This is a super complex topic. There are at least three separate levels of critique possible, and all are important.
>
> First there is the clarity of the PEP. Steven D'Aprano has given you great detailed feedback here and you should take it to heart (even if you disagree with his opinion about the specifics). I'd also recommend treating some of the "rejected alternatives" more like "open issues" (which are to be resolved during the review and feedback cycle). And you probably need some new terminology -- the abbreviation SLNB is awkward (I keep having to look it up), and I think we need a short, crisp name for the new variable type.

I've used "ephemeral name binding" before, but that's even longer than
saying ess-ell-enn-bee (for Statement Local Name Binding), and also
doesn't feel right for a proposal that allows the binding to persist
for the entire suite in compound statements.

Given the existing namespace stack of
builtin<-global<-nonlocal<-local, one potential short name would be
"sublocal", to indicate that these references are even more local than
locals (they're *so* local, they don't even appear in locals()!).

> Then there is the issue of syntax. While `(f() as x)` is a cool idea (and we should try to recover who deserves credit for first proposing it),

I know I first suggested it years ago, but I don't recall if anyone
else proposed it before me.

> it's easy to overlook in the middle of an expression.

That I agree with - the more examples I've seen using it, the less
I've liked how visually similar "(a as b)" is to "(a and b)".

> It's arguably more confusing because the scoping rules you propose are so different from the three existing uses of `as NAME` -- and it causes an ugly wart in the PEP because two of those other uses are syntactically so close that you propose to ban SLNBs there. When it comes to alternatives, I think we've brainwashed ourselves so thoroughly into believing that inline assignments using `=` are evil that it's hard to objectively explain why they're bad -- we're just repeating the mantra here. I wish we could do more quantitative research into how bad this actually is in languages that do have it. We should also keep an open mind about alternative solutions present in other languages. Here it would be nice if we had some qualitative research into what other languages actually do (both about syntax and about semantics, for sure).

Writing "name = expr" when you meant "name == expr" remains a common
enough source of bugs in languages that allow it that I still wouldn't
want to bring that particular opportunity for semantically significant
typos over to Python.
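
As a quick illustration of why that class of typo can't bite in Python
today, here's a minimal sketch showing that the compiler rejects "="
where an expression is required. The compiles() helper and the sample
source strings are made up for this example.

```python
def compiles(source):
    """Return True if `source` compiles as a module, False on SyntaxError."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


# The intended comparison compiles without complaint.
comparison_ok = compiles("if name == 1:\n    pass\n")

# The one-character typo is rejected at compile time, rather than
# silently rebinding `name` and testing the truthiness of the result.
typo_ok = compiles("if name = 1:\n    pass\n")
```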

Using "name := expr" doesn't have that problem though (since
accidentally adding ":" is a much harder typo to make than leaving out
"="), and has the added bonus that we could readily restrict the LHS
to single names. I also quite like the way it reads in conditional
expressions:

    value = f() if (f := lookup_function(args)) is not None else default

And if we do end up going with the approach of defining a separate
sublocal namespace, the fact that "n := ..." binds a sublocal, while
"n = ..." and "... as n" both bind regular locals would be clearer
than having the target scope of "as" be context dependent.
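
For anyone wanting to try the conditional expression form, here's a
rough sketch. It assumes an interpreter that accepts ":=" inside
expressions (CPython 3.8 and later do, with the name bound in the
ordinary enclosing scope rather than in a separate sublocal one).
lookup_function(), its registry, and the default value are
hypothetical stand-ins.

```python
def lookup_function(name):
    """Hypothetical lookup: return a callable for known names, else None."""
    registry = {"answer": lambda: 42}
    return registry.get(name)


default = "fallback"

# The condition of a conditional expression is evaluated first, so `f`
# is already bound by the time `f()` is evaluated.
found = f() if (f := lookup_function("answer")) is not None else default

# When the lookup fails, `f()` is never evaluated and the default is used.
missing = f() if (f := lookup_function("nope")) is not None else default
```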

> The third issue is that of semantics. I actually see two issues here. One is whether we need a new scope (and whether it should be as weird as proposed). Steven seems to think we don't. I'm not sure that the counter-argument that we're already down that path with comprehension scopes is strong enough.

> The other issue is that, if we decide we *do* need (or want) statement-local scopes, the PEP must specify the exact scope of a name bound at any point in a statement. E.g. is `d[x] = (f() as x)` valid? And what should we do if a name may or may not be bound, as in `if (f(1) as x) or (f(2) as y): g(y)` -- should that be a compile-time error (since we can easily tell that y isn't always defined when `g(y)` is called) or a runtime error (as we do for unbound "classic" locals)? And there are further details, e.g. are these really not allowed to be closures? And are they single-assignment? (Or can you do e.g. `(f(1) as x) + (f(2) as x)`?)

I think the need to specify evaluation order more explicitly applies
regardless of whether we define a sublocal scope or not: expression
level name binding in any form makes evaluation order (and evaluation
scope!) matter in ways that we can currently gloss over, since you
need to be relying on functions with side effects in order to even
observe the differences.
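
To make the "side effects" point concrete, here's a small sketch. The
note() helper is hypothetical; it exists only to record when each
operand gets evaluated. The language reference does document
left-to-right operand evaluation for this particular case, but nothing
in the result itself would reveal it.

```python
evaluated = []


def note(value):
    """Record that this operand was evaluated, then pass it through."""
    evaluated.append(value)
    return value


# The multiplication is *performed* before the addition, but the operands
# are still *evaluated* left to right. Only the side effect in note()
# makes that distinction visible.
total = note(1) + note(2) * note(3)
```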

If the expression level bindings are just ordinary locals, it does
open up some potentially interesting order of evaluation testing
techniques, though:

    expected_order = list(range(3))
    actual_order = iter(expected_order)
    defaultdict(int)[
        (first := next(actual_order)):
        (second := next(actual_order)):
        (third := next(actual_order))
    ]
    self.assertEqual([first, second, third], expected_order)

With sublocals, you'd need to explicitly promote them to regular
locals to get the same effect:

    expected_order = list(range(3))
    actual_order = iter(expected_order)
    __, first, second, third = defaultdict(int)[
        (first := next(actual_order)):
        (second := next(actual_order)):
        (third := next(actual_order))
    ], first, second, third
    self.assertEqual([first, second, third], expected_order)

That said, it's debatable whether *either* of those is any clearer for
that task than the status quo of just using list append operations:

    expected_order = list(range(3))
    actual_order = []
    defaultdict(int)[actual_order.append(0):actual_order.append(1):actual_order.append(2)]
    self.assertEqual(actual_order, expected_order)
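
A self-contained variant of that status quo test, as a sketch only.
One caveat: slice objects only became hashable in Python 3.12, so
using one as a defaultdict key raises TypeError on earlier versions.
To sidestep that, this version swaps in a hypothetical SliceSink class
whose __getitem__ accepts any key. The left-to-right result is what
CPython does in practice, not something the reference manual promises.

```python
import unittest


class SliceSink:
    """Hypothetical stand-in for the subscripted object: accepts any key."""

    def __getitem__(self, key):
        return key


class SliceOrderTest(unittest.TestCase):
    def test_slice_operands_left_to_right(self):
        expected_order = list(range(3))
        actual_order = []
        # list.append() returns None, so the key is slice(None, None, None);
        # only the side effects on actual_order matter here.
        key = SliceSink()[
            actual_order.append(0):
            actual_order.append(1):
            actual_order.append(2)
        ]
        self.assertEqual(key, slice(None, None, None))
        self.assertEqual(actual_order, expected_order)


# Run the test case programmatically so the outcome can be inspected.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(SliceOrderTest)
result = unittest.TestResult()
suite.run(result)
```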

> I'm not sure if there are still places in Python where evaluation order is unspecified, but I suspect there are (at the very least the reference manual is incomplete in specifying the exact rules, e.g. I can't find words specifying the evaluation order in a slice). We'll need to fix all of those, otherwise the use of local name bindings in such cases would have unspecified semantics (or the evaluation order could suddenly shift when a local name binding was added).

One that surprised me earlier today is that it looks like we never
transferred the generator expression wording about the scope of
evaluation for the outermost iterable over to the sections describing
comprehension evaluation - we only point out that the result
subexpression evaluation and the iteration variable binding happen in
a nested scope. (Although now I'm wondering if there might already be
a docs tracker issue for that, and I just forgot about it)
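
The difference in evaluation scope is easiest to see in a class body,
where the nested scope can't see class-level names. A minimal sketch
(the Demo and Broken classes and their attribute names are made up for
illustration):

```python
class Demo:
    values = [1, 2, 3]
    # Works: the outermost iterable (`values`) is evaluated in the class
    # scope, and only the resulting iterator is handed to the nested scope.
    copied = [v for v in values]


try:
    class Broken:
        values = [1, 2, 3]
        scale = 10
        # Fails: the result subexpression runs in the nested scope, which
        # skips the class namespace when looking up `scale`.
        scaled = [v * scale for v in values]
except NameError as exc:
    error = exc
else:
    error = None
```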

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia