[Python-ideas] PEP 572: Statement-Local Name Bindings, take three!
Steven D'Aprano
steve at pearwood.info
Sat Mar 24 00:41:02 EDT 2018
To keep this a manageable length, I've trimmed vigorously. Apologies in
advance if I've been too enthusiastic with the trimming :-)
On Sat, Mar 24, 2018 at 05:09:54AM +1100, Chris Angelico wrote:
> No, I haven't yet. Sounds like a new section is needed. Thing is,
> there's a HUGE family of C-like and C-inspired languages that allow
> assignment expressions, and for the rest, I don't have any personal
> experience. So I need input from people: what languages do you know of
> that have small-scope name bindings like this?
I don't know if this counts as "like this", but Lua has a do...end block
that introduces a new scope. Something like this:
    x = 1
    do:
        x = 2
        print(x)  # prints 2
    end
    print(x)  # prints 1
I think that's a neat concept, but I'm struggling to think what I would
use it for.
[...]
> > result = (func(x), func(x)+1, func(x)*2)
>
> True, but outside of comprehensions, the most obvious response is
> "just add another assignment statement". You can't do that in a list
> comp (or equivalently in a genexp or dict comp).
Yes you can: your PEP gives equivalents that work fine for list comps,
ranging from factoring the duplicate code out into a helper function
to using a separate loop to get assignment:
    [(spam, spam+1) for x in values for spam in (func(x),)]
    [(spam, spam+1) for spam in (func(x) for x in values)]
They are the equivalent of "just add another assignment statement" for
comprehensions.
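As a quick check, both spellings run in current Python and evaluate the function once per element. In this sketch, func and values are stand-ins invented for the illustration:

```python
calls = []

def func(x):
    # Hypothetical stand-in for an expensive function; records each call.
    calls.append(x)
    return x * 10

values = [1, 2, 3]

# An extra "for" clause over a one-element tuple acts as an assignment.
first = [(spam, spam + 1) for x in values for spam in (func(x),)]

# The inner generator expression computes func(x); the outer loop names it.
second = [(spam, spam + 1) for spam in (func(x) for x in values)]

print(first == second)  # both forms build the same list
print(len(calls))       # one call per element, per comprehension
```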
I acknowledge that comprehensions are the motivating example here, but I
don't think they're the only justification for the concept.
Strictly speaking, there's never a time that we cannot use a new
assignment statement. But sometimes it is annoying or inconvenient.
Consider a contrived example:
    TABLE = [
        alpha,
        beta,
        gamma,
        delta,
        ...
        func(omega) + func(omega)**2 + func(omega)**3,
    ]
Yes, I can pull out the duplication:
    temp = func(omega)
    TABLE = [
        alpha,
        beta,
        gamma,
        delta,
        ...
        temp + temp**2 + temp**3,
    ]
but that puts the definition of temp quite distant from its use. So this
is arguably nicer:
    TABLE = [
        alpha,
        beta,
        gamma,
        delta,
        ...
        (func(omega) as temp) + temp**2 + temp**3,
    ]
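For comparison, current Python can already keep the binding next to its use by passing the value to an immediately called lambda, though arguably at some cost in readability. A minimal sketch, with func, omega and the table entries as placeholders:

```python
calls = 0

def func(x):
    # Hypothetical stand-in; counts how often it is evaluated.
    global calls
    calls += 1
    return x + 1

omega = 2

TABLE = [
    "alpha",
    "beta",
    # Bind func(omega) to temp right where it is used.
    (lambda temp: temp + temp**2 + temp**3)(func(omega)),
]

print(TABLE[-1], calls)
```

The lambda parameter plays the role of temp, and func(omega) is evaluated only once.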
> >> Just as function-local names shadow global names for the scope of the
> >> function, statement-local names shadow other names for that statement.
> >> (They can technically also shadow each other, though actually doing this
> >> should not be encouraged.)
> >
> > That seems weird.
>
> Which part? That they shadow, or that they can shadow each other?
Shadowing themselves.
I'm still not convinced these should just shadow local variables. Of
course locals will shadow nonlocals, which shadow globals, which shadow
builtins. I'm just not sure that we gain much (enough?) to justify
adding a new scope between what we already have:
    proposed statement-local
    local
    nonlocal
    class (only during class statement)
    global
    builtins
I think that needs justification by more than just "it makes the
implementation easier".
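The existing lookup order can be seen in a small sketch. The name oct is chosen arbitrarily as a builtin to shadow, and the class scope is left out for brevity:

```python
import builtins

def oct(n):
    # Module-level (global) name shadowing the builtin oct.
    return "global"

def outer():
    def oct(n):
        # Enclosing-function name, seen from the inner functions as a nonlocal.
        return "nonlocal"

    def uses_nonlocal():
        return oct(8)

    def uses_local():
        def oct(n):
            # A local name shadows everything further out.
            return "local"
        return oct(8)

    return uses_local(), uses_nonlocal()

results = (*outer(), oct(8), builtins.oct(8))
print(results)
```

A statement-local scope would add one more layer in front of the first of these.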
> Shadowing is the same as nested functions (including comprehensions,
> since they're implemented with functions); and if SLNBs are *not* to
> shadow each other, the only way is to straight-up disallow it.
Or they can just rebind to the same (statement-)local. E.g.:
    while ("spam" as x):
        assert x == "spam"
        while ("eggs" as x):
            assert x == "eggs"
            break
        assert x == "eggs"
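Under that "rebind" rule, the behaviour would match what ordinary assignment already does. The following sketch emulates it with plain assignment statements in current Python; the extra break on the outer loop is added only so that the example terminates:

```python
def rebind_demo():
    seen = []
    x = "spam"
    while x:
        seen.append(x)      # "spam" on entry to the outer loop
        x = "eggs"          # the inner binding reuses the same name
        while x:
            seen.append(x)  # "eggs" inside the inner loop
            break
        seen.append(x)      # still "eggs": there is no separate inner scope
        break               # stop here so the sketch terminates
    return seen

print(rebind_demo())
```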
> > Why can they not be used in closures? I expect that's going to cause a
> > lot of frustration.
>
> Conceptually, the variable stops existing at the end of that
> statement. It makes for some oddities, but fewer oddities than every
> other variant that I toyed with. For example, does this create one
> single temporary or many different temporaries?
>
> def f():
>     x = "outer"
>     funcs = {}
>     for i in range(10):
>         if (g(i) as x) > 0:
>             def closure():
>                 return x
>             funcs[x] = closure
I think the rule should be either:
- statement-locals actually *are* locals and so behave like locals;
- statement-locals introduce a new scope, but still behave like
locals with respect to closures.
No need to introduce two separate modes of behaviour. (Or if there is
such a need, then the PEP should explain it.)
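If statement-locals behaved like locals with respect to closures, the quoted example would presumably act like this version written with an ordinary local. This is a sketch only: g is a made-up stand-in, and the assignment statement replaces the proposed (g(i) as x) syntax:

```python
def g(i):
    # Hypothetical stand-in for the g() in the quoted example.
    return i - 4

def f():
    x = "outer"
    funcs = {}
    for i in range(10):
        x = g(i)              # ordinary local instead of (g(i) as x)
        if x > 0:
            def closure():
                return x      # looked up when called, not when defined
            funcs[x] = closure
    return funcs

funcs = f()
print(sorted(funcs))                          # one key per positive g(i)
print({k: fn() for k, fn in funcs.items()})   # every closure sees the final x
```

With an ordinary local there is a single variable shared by all the closures, which is the existing, well-known behaviour.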
> > I think there's going to be a lot of confusion about which uses of "as"
> > bind to a new local and which don't.
>
> That's the exact point of "statement-local" though.
I don't think so. As I say:
> > I think this proposal is conflating two unrelated concepts:
> >
> > - introducing new variables in order to meet DRY requirements;
> >
> > - introducing a new scope.
If you're going to champion *both* concepts, then you need to justify
them both in the PEP, not just assume it's obvious why we want both
together.
> > "SLNB"? Undefined acronym. What is it? I presume it has something to do
> > with the single-statement variable.
>
> Statement-Local Name Binding, from the title of the PEP. (But people
> probably don't read titles.)
Indeed. In case it isn't obvious, you should define the acronym the
first time you use it in the PEP.
> > This idea of local + single-statement names in the same expression
> > strikes me as similar. Having that same sort of thing happening within a
> > single statement gives me a headache:
> >
> > spam = (spam, ((spam + spam as spam) + spam as spam), spam)
> >
> > Explain that, if you will.
>
> Sure. First, eliminate all the name bindings:
[...]
The point is not that it cannot be explained, but that it requires
careful thought to understand. An advantage of using just regular locals
is that we don't have to think about the consequences of introducing two
new scopes. It's all happening to the same "a" variable.
> > Ah, how surprising -- given the tone of this PEP, I honestly thought
> > that it only applied to a single statement, not compound statements.
> >
> > You should mention this much earlier.
>
> Hmm. It's right up in the Rationale section, but without an example.
> Maybe an example would make it clearer?
Yes :-)
> >> * An SLNB cannot be the target of any form of assignment, including augmented.
> >> Attempting to do so will remove the SLNB and assign to the fully-scoped name.
> >
> > What's the justification for this limitation?
>
> Not having that limitation creates worse problems, like that having
> "(1 as a)" somewhere can suddenly make an assignment fail. This is
> particularly notable with loop headers rather than simple statements.
How and why would it fail?
> >> # See, for instance, Lib/pydoc.py
> >> if (re.search(pat, text) as match):
> >>     print("Found:", match.group(0))
> >
> > I do not believe that is actually code found in Lib/pydoc.py, since that
> > will be a syntax error. What are you trying to say here?
>
> Lib/pydoc.py has a more complicated version of the exact same
> functionality. This would be a simplification of a common idiom that
> can be found in the stdlib and elsewhere.
Then the PEP should show a "Before" and "After".
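A "Before" might look something like the sketch below. It is not the actual code from Lib/pydoc.py, just the general idiom, with report as a made-up wrapper so the result can be checked:

```python
import re

# "After", under the PEP's proposed syntax (not valid Python today):
#     if (re.search(pat, text) as match):
#         print("Found:", match.group(0))

def report(pat, text):
    # "Before": the match needs its own assignment statement first.
    match = re.search(pat, text)
    if match:
        return "Found: " + match.group(0)
    return None

print(report(r"\d+", "abc 123"))
```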
> >> Performance costs
> >> =================
> >>
> >> The cost of SLNBs must be kept to a minimum, particularly when they are not
> >> used; the normal case MUST NOT be measurably penalized.
> >
> > What is the "normal case"?
>
> The case where you're not using any SLNBs.
The PEP should make this clearer:
"Any implementation must not impose any significant performance cost on
code that does not use statement-locals."
> > It takes time, even if only a nanosecond, to bind a value to a
> > name, as opposed to *not* binding it to a name.
> >
> > x = (spam as eggs)
> >
> > has to be more expensive than
> >
> > x = spam
> >
> > because the first performs two name bindings rather than one. So "MUST
> > NOT" already implies this proposal *must* be rejected. Perhaps you mean
> > that there SHOULD NOT be a SIGNIFICANT performance penalty.
>
> The mere fact that this feature exists in the language MUST NOT
> measurably impact Python run-time performance.
MUST NOT implies that if there is *any* measurable penalty, even a
nanosecond, the feature must be rejected. I think that's excessive.
Surely a nanosecond cost for the normal case is a reasonable tradeoff
if it buys us better expressiveness?
Beware of talking in absolutes unless you really mean them.
Besides, as soon as you talk performance, the question has to be, which
implementation?
Of course we don't necessarily want to impose unreasonable performance
and maintenance costs on any implementation. But surely performance
cost is a quality of implementation issue. It ought to be a matter of
trade-offs: is the benefit sufficient to make up for the cost?
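To put a rough number on the cost of an extra binding for one particular implementation, something like the following could be used. It is a sketch only: the two functions emulate "x = spam" and "x = (spam as eggs)" with ordinary assignments, and the timings depend entirely on the machine and interpreter:

```python
import timeit

spam = object()

def one_binding():
    x = spam
    return x

def two_bindings():
    eggs = spam   # the extra binding that (spam as eggs) would perform
    x = eggs
    return x

# The second function carries one more local variable than the first.
print(one_binding.__code__.co_nlocals, two_bindings.__code__.co_nlocals)

# Timings vary by machine and interpreter; compare them, but do not
# treat the numbers as absolute.
for fn in (one_binding, two_bindings):
    print(fn.__name__, timeit.timeit(fn, number=100_000))
```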
--
Steve