[Python-ideas] PEP 572: Statement-Local Name Bindings, take three!

Sat Mar 24 05:49:08 EDT 2018

On Sat, Mar 24, 2018 at 3:41 PM, Steven D'Aprano <steve at pearwood.info> wrote:
> To keep this a manageable length, I've trimmed vigorously. Apologies in
> advance if I've been too enthusiastic with the trimming :-)
>
> On Sat, Mar 24, 2018 at 05:09:54AM +1100, Chris Angelico wrote:
>
>> No, I haven't yet. Sounds like a new section is needed. Thing is,
>> there's a HUGE family of C-like and C-inspired languages that allow
>> assignment expressions, and for the rest, I don't have any personal
>> experience. So I need input from people: what languages do you know of
>> that have small-scope name bindings like this?
>
> I don't know if this counts as "like this", but Lua has a do...end block
> that introduces a new scope. Something like this:
>
>     x = 1
>     do
>        local x = 2
>        print(x)  -- prints 2
>     end
>     print(x)  -- prints 1
>
> I think that's a neat concept, but I'm struggling to think what I would
> use it for.

Okay. I'll leave off for now, but if the split of PEPs happens, I'll
need to revisit that.

>> >     result = (func(x), func(x)+1, func(x)*2)
>>
>> True, but outside of comprehensions, the most obvious response is
>> "just add another assignment statement". You can't do that in a list
>> comp (or equivalently in a genexp or dict comp).
>
> Yes you can: your PEP gives equivalents that work fine for list comps,
> ranging from factorising the duplicate code out into a helper function
> to using a separate loop to get assignment:
>
>     [(spam, spam+1) for x in values for spam in (func(x),)]
>
>     [(spam, spam+1) for spam in (func(x) for x in values)]
>
> They are the equivalent of "just add another assignment statement" for
> comprehensions.

They might be mechanically equivalent. They are not syntactically
equivalent. This PEP is not about "hey let's do something in Python
that's utterly impossible to do". It's "here's a much tidier way to
spell something that currently has to be ugly".
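
As a rough sketch of what "mechanically equivalent" means here, the two
workaround spellings can be run side by side in today's Python. The
func() and values below are made-up stand-ins for illustration, not
anything from the PEP:

```python
# Hypothetical stand-ins; neither is defined anywhere in this thread.
def func(x):
    return x * 10

values = [1, 2, 3]

# Workaround 1: a one-element inner loop plays the role of an assignment.
via_inner_loop = [(spam, spam + 1) for x in values for spam in (func(x),)]

# Workaround 2: a nested generator expression computes func(x) up front.
via_genexp = [(spam, spam + 1) for spam in (func(x) for x in values)]

# Baseline: call func() twice per element, which both workarounds avoid.
via_repeat = [(func(x), func(x) + 1) for x in values]

print(via_inner_loop == via_genexp == via_repeat)
```

All three build the same list; the difference is only in how much
machinery the reader has to decode to see that.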

> Strictly speaking, there's never a time that we cannot use a new
> assignment statement. But sometimes it is annoying or inconvenient.
> Consider a contrived example:
>
> TABLE = [
>          alpha,
>          beta,
>          gamma,
>          delta,
>          ...
>          func(omega) + func(omega)**2 + func(omega)**3,
>          ]
>
>
> Yes, I can pull out the duplication:
>
> temp = func(omega)
> TABLE = [
>          alpha,
>          beta,
>          gamma,
>          delta,
>          ...
>          temp + temp**2 + temp**3,
>          ]
>
> but that puts the definition of temp quite distant from its use. So this
> is arguably nicer:
>
> TABLE = [
>          alpha,
>          beta,
>          gamma,
>          delta,
>          ...
>          (func(omega) as temp) + temp**2 + temp**3,
>          ]

Right. Definitely advantageous (and another reason not to go with the
comprehension-specific options).

>> >> Just as function-local names shadow global names for the scope of the
>> >> function, statement-local names shadow other names for that statement.
>> >> (They can technically also shadow each other, though actually doing this
>> >> should not be encouraged.)
>> >
>> > That seems weird.
>>
>> Which part? That they shadow, or that they can shadow each other?
>
> Shadowing themselves.
>
> I'm still not convinced these should just shadow local variables. Of
> course locals will shadow nonlocals, which shadow globals, which shadow
> builtins. I'm just not sure that we gain much (enough?) to justify
> adding a new scope between what we already have:
>
> proposed statement-local
> local
> nonlocal
> class (only during class statement)
> global
> builtins
>
> I think that needs justification by more than just "it makes the
> implementation easier".

Nick has answered this part better than I can, so I'll just say "yep,
read his post". :)

>> Shadowing is the same as nested functions (including comprehensions,
>> since they're implemented with functions); and if SLNBs are *not* to
>> shadow each other, the only way is to straight-up disallow it.
>
> Or they can just rebind to the same (statement-)local. E.g.:
>
> while ("spam" as x):
>     assert x == "spam"
>     while ("eggs" as x):
>         assert x == "eggs"
>         break
>     assert x == "eggs"

That means that sometimes, ``while ("eggs" as x):`` creates a new
variable, and sometimes it doesn't. Why should that be?

If you change the way that "spam" is assigned to x, the semantics of
the inner 'while' block shouldn't change. It creates a subscope, it
uses that subscope, the subscope expires. Curtain comes down. By your
proposal, you have to check whether 'x' is shadowing some other
variable, and if so, what type. By mine, it doesn't matter; regardless
of whether 'x' existed or not, regardless of whether there's any other
x in any other scope, that loop behaves the same way.
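
The closest runnable analogy in current Python is a nested function.
This is only a sketch of the shadowing rule, not of the proposal
itself (the "as" syntax doesn't exist in any released Python), and the
function names are invented for the example:

```python
def outer():
    x = "spam"

    def inner():
        # This assignment creates a fresh x that belongs to inner() only.
        x = "eggs"
        return x

    inner_saw = inner()
    # Back in outer(), x is whatever it was before inner() ran.
    return inner_saw, x

print(outer())
```

Renaming or removing the outer x changes nothing about how inner()
behaves, which is the property I want the subscope to have.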

Function-local names give the same confidence. It doesn't matter what
names you use inside a function (modulo 'global' or 'nonlocal'
declarations) - they quietly shadow anything from the outside. You
need only care about module names duplicating local names if you
actually need to use both *in the same context*. Same with built-ins;
it's fine to say "id = 42" inside a function as long as you aren't
going to also use the built-in id() function in that exact function.
Code migration is easy.
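
A minimal illustration of that, using names picked purely for the
example:

```python
import builtins

name = "module-level"

def uses_local_names():
    # Plain assignments make both names local to this function.
    name = "function-local"
    id = 42
    return name, id

local_values = uses_local_names()

# Outside the function, the module-level name and the built-in are untouched.
module_name_after = name
builtin_id_intact = id is builtins.id
print(local_values, module_name_after, builtin_id_intact)
```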

>> > Why can they not be used in closures? I expect that's going to cause a
>> > lot of frustration.
>>
>> Conceptually, the variable stops existing at the end of that
>> statement. It makes for some oddities, but fewer oddities than every
>> other variant that I toyed with. For example, does this create one
>> single temporary or many different temporaries?
>>
>> def f():
>>     x = "outer"
>>     funcs = {}
>>     for i in range(10):
>>         if (g(i) as x) > 0:
>>             def closure():
>>                 return x
>>             funcs[x] = closure
>
> I think the rule should be either:
>
> - statement-locals actually *are* locals and so behave like locals;
>
> - statement-locals introduce a new scope, but still behave like
>   locals with respect to closures.
>
> No need to introduce two separate modes of behaviour. (Or if there is
> such a need, then the PEP should explain it.)

That would basically mean locking in some form of semantics. For your
first example, you're locking in the rule that "(g(i) as x)" is
exactly the same as "x = g(i)", and you HAVE to then allow that this
will potentially assign to global or nonlocal names as well (subject
to the usual rules). In other words, you have assignment-as-expression
without any form of subscoping. This is a plausible stance and may
soon become a separate PEP.
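
Under that first reading, the earlier example can be written with a
plain assignment and run today. This is a sketch only: g() is a
hypothetical stand-in, and the assignment statement replaces the
proposed "(g(i) as x)" spelling:

```python
def g(i):
    # Hypothetical stand-in for the g() in the quoted example.
    return i - 4

def f():
    funcs = {}
    for i in range(10):
        x = g(i)  # plain assignment instead of (g(i) as x)
        if x > 0:
            def closure():
                return x
            funcs[x] = closure
    return funcs

funcs = f()
keys = sorted(funcs)
seen_by_closures = {fn() for fn in funcs.values()}
print(keys, seen_by_closures)
```

There is exactly one x, so every closure reports the value it held
when the loop finished, whatever key the closure was stored under.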

But for your second, you're locking in the same oddities that a 'with'
block has: that a variable is being "created" and "destroyed", yet it
sticks around for the rest of the function, just in case. It's a
source of some confusion to people that the name used in a 'with'
statement is actually still valid afterwards. Or does it only stick
around if there is a function to close over it?
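
For anyone who hasn't run into it, the 'with' behaviour looks like
this (io.StringIO is used only to keep the example self-contained):

```python
import io

with io.StringIO("spam") as stream:
    first_read = stream.read()

# The block is over and the stream is closed, yet the name is still bound.
still_bound_and_closed = stream.closed
print(first_read, still_bound_and_closed)
```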

Honestly, I really want to toss this one into the "well don't do that"
basket, and let the semantics be dictated by simplicity and
cleanliness even if it means that a closure doesn't see that variable.

>> > "SLNB"? Undefined acronym. What is it? I presume it has something to do
>> > with the single-statement variable.
>>
>> Statement-Local Name Binding, from the title of the PEP. (But people
>> probably don't read titles.)
>
> Indeed. In case it isn't obvious, you should define the acronym the
> first time you use it in the PEP.

Once again, I assumed too much of people. Expected them to actually
read the stuff they're discussing. And once again, the universe
reminds me that people aren't like that. Ah well. Will fix that next
round of edits.

>> >> * An SLNB cannot be the target of any form of assignment, including augmented.
>> >>   Attempting to do so will remove the SLNB and assign to the fully-scoped name.
>> >
>> > What's the justification for this limitation?
>>
>> Not having that limitation creates worse problems, like that having
>> "(1 as a)" somewhere can suddenly make an assignment fail. This is
>> particularly notable with loop headers rather than simple statements.
>
> How and why would it fail?

a = (1 as a)

With current semantics, this is equivalent to "a = 1". If assignment
went into the SLNB, it would be equivalent to "pass". Which do you
expect it to do?

> MUST NOT implies that if there is *any* measurable penalty, even a
> nano-second, the feature must be rejected. I think that's excessive.
> Surely a nanosecond cost for the normal case is a reasonable tradeoff
> if it buys us better expressiveness?

Steve, you know how to time a piece of code. You debate these kinds of
points on python-list frequently. Are you seriously trying to tell me
that you could measure a single nanosecond in regular compiling and
running of Python code?

With my current implementation, there is an extremely small cost
during compilation (a couple of checks of a pointer in a structure,
and if it's never changed from its initial NULL, nothing else
happens), and zero cost at run time. I believe that this counts as "no
measurable penalty".
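
One rough way to see how much repeated timings of the same work
differ is the sketch below. It times compile() on an arbitrary
snippet; the snippet, the run counts, and the helper name are all
assumptions for illustration, and the numbers will vary by machine:

```python
import timeit

SOURCE = "result = [n * 2 for n in range(10)]\n"  # arbitrary sample code

def compile_once():
    return compile(SOURCE, "<sample>", "exec")

runs = 2_000
# Five independent timings of the same work.
samples = timeit.repeat(compile_once, repeat=5, number=runs)
per_call_ns = [s / runs * 1e9 for s in samples]
spread_ns = max(per_call_ns) - min(per_call_ns)

print("per-call ns:", [round(v) for v in per_call_ns])
print("spread ns:", round(spread_ns))
```

The spread between repeats gives a sense of the noise floor that any
claimed per-compile cost would have to rise above.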

> Beware of talking in absolutes unless you really mean them.
>
> Besides, as soon as you talk performance, the question has to be, which
> implementation?
>
> Of course we don't want to necessarily impose unreasonable performance
> and maintenance costs on any implementation. But surely performance
> cost is a quality of implementation issue. It ought to be a matter of
> trade-offs: is the benefit sufficient to make up for the cost?

I don't see where this comes in. Let's say that Jython can't implement
this feature without a 10% slowdown in run-time performance even if
these subscopes aren't used. What are you saying the PEP should say?
That it's okay for this feature to hurt performance by 10%? Then it
should be rightly rejected. Or that Jython is allowed to ignore this
feature? Or what?

ChrisA