[Python-ideas] PEP 572: Statement-Local Name Bindings, take three!
Steven D'Aprano
steve at pearwood.info
Fri Mar 23 11:00:58 EDT 2018
On Fri, Mar 23, 2018 at 09:01:01PM +1100, Chris Angelico wrote:
> PEP: 572
> Title: Syntax for Statement-Local Name Bindings
[...]
> Abstract
> ========
>
> Programming is all about reusing code rather than duplicating it.
I don't think that editorial comment belongs here, or at least, it is
way too strong. I'm pretty sure that programming is not ALL about reusing code,
and code duplication is not always wrong.
Rather, we can say that *often* we want to avoid code duplication, and
this proposal is one way to do so. And this should go into the
Rationale, not the Abstract. The abstract should describe what this
proposal *does*, not why, for example:
    This is a proposal for permitting temporary name bindings
    which are limited to a single statement.
What the proposal *is* goes in the Abstract; reasons *why* we want it
go in the Rationale.
I see you haven't mentioned anything about Nick Coghlan's (long ago)
concept of a "where" block. If memory serves, it would be something
like:
    value = x**2 + 2*x where:
        x = some expression
These are not necessarily competing, but they are relevant.
Nor have you done a review of any other languages, to see what similar
features they already offer. Not even C's form of "assignment as an
expression" -- you should refer to that, and explain why this would not
similarly be a bug magnet.
> Rationale
> =========
>
> When a subexpression is used multiple times in a list comprehension,
I think that list comps are merely a single concrete example of a more
general concept: we sometimes want or need to apply the DRY
principle to a single expression.
This is (usually) a violation of DRY whether it is inside or outside of
a list comp:
    result = (func(x), func(x)+1, func(x)*2)
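For comparison, here is a minimal sketch of how that duplication is removed in current Python, with an ordinary temporary name. The func below is a made-up stand-in that counts its own calls, and tmp is just an illustrative name:

```python
calls = 0

def func(x):
    """Hypothetical stand-in for an expensive call; counts how often it runs."""
    global calls
    calls += 1
    return x * 10

x = 3

# Repeating the subexpression: func is called three times.
repeated = (func(x), func(x) + 1, func(x) * 2)

# Factoring it out with a plain assignment: func is called once.
tmp = func(x)
factored = (tmp, tmp + 1, tmp * 2)
```

Both tuples hold the same values. The cost of the second form is an extra statement and a name, tmp, that stays bound after the statement that needed it.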
> Syntax and semantics
> ====================
>
> In any context where arbitrary Python expressions can be used, a **named
> expression** can appear. This must be parenthesized for clarity, and is of
> the form ``(expr as NAME)`` where ``expr`` is any valid Python expression,
> and ``NAME`` is a simple name.
>
> The value of such a named expression is the same as the incorporated
> expression, with the additional side-effect that NAME is bound to that
> value for the remainder of the current statement.
Examples should go with the description. Such as:
    x = None if (spam().ham as eggs) is None else eggs
    y = ((spam() as eggs), (eggs.method() as cheese), cheese[eggs])
> Just as function-local names shadow global names for the scope of the
> function, statement-local names shadow other names for that statement.
> (They can technically also shadow each other, though actually doing this
> should not be encouraged.)
That seems weird.
> Assignment to statement-local names is ONLY through this syntax. Regular
> assignment to the same name will remove the statement-local name and
> affect the name in the surrounding scope (function, class, or module).
That seems unnecessary. Since the scope only applies to a single
statement, not a block, there can be no other assignment to that name.
Correction: I see further in that this isn't the case. But that's deeply
confusing, to have the same name refer to two (or more!) scopes in the
same block. I think that's going to lead to some really confusing
scoping problems.
> Statement-local names never appear in locals() or globals(), and cannot be
> closed over by nested functions.
Why can they not be used in closures? I expect that's going to cause a
lot of frustration.
> Execution order and its consequences
> ------------------------------------
>
> Since the statement-local name binding lasts from its point of execution
> to the end of the current statement, this can potentially cause confusion
> when the actual order of execution does not match the programmer's
> expectations. Some examples::
>
>     # A simple statement ends at the newline or semicolon.
>     a = (1 as y)
>     print(y) # NameError
That error surprises me. Every other use of "as" binds to the
current local namespace. (Or global, if you use the global
declaration first.)
I think there's going to be a lot of confusion about which uses of "as"
bind to a new local and which don't.
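As a rough illustration of what "as" does today, the sketch below uses only "import ... as" and "with ... as" inside a function. The function name and the sample strings are my own, chosen purely for the example:

```python
import io

def as_bindings_persist():
    """Show that names bound by 'as' outlive the statement that bound them."""
    import os.path as osp                  # binds osp in the local namespace

    with io.StringIO("spam") as buffer:    # binds buffer in the local namespace
        text = buffer.read()

    # Both names are still usable here, after their statements have ended.
    return osp.basename("a/b.txt"), text, buffer.closed
```

Calling as_bindings_persist() returns ('b.txt', 'spam', True): the with block has closed the buffer, but the name is still bound. The one partial exception I know of is "except ... as", where the name is unbound again at the end of the except clause.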
I think this proposal is conflating two unrelated concepts:
- introducing new variables in order to meet DRY requirements;
- introducing a new scope.
Why can't we do the first without the second?
    a = (1 as y)
    print(y) # prints 1, as other uses of "as" would do
That would avoid the unnecessary (IMO) restriction that these variables
cannot be used in closures.
>     # The assignment ignores the SLNB - this adds one to 'a'
>     a = (a + 1 as a)
"SLNB"? Undefined acronym. What is it? I presume it has something to do
with the single-statement variable.
I know it would be legal, but why would you write something like that?
Surely your examples must at least have a pretence of being useful (even
if the examples are only toy examples rather than realistic).
I think that having "a" be both local and single-statement in the same
expression is an awful idea. Lua has the (mis-)features that
variables are global by default, locals need to be declared, and the
same variable name can refer to both local and global simultaneously.
Thus we have:
    print(x)         -- prints the global x
    local x = x + 1  -- sets local x to the global x plus 1
    print(x)         -- prints the local x
https://www.lua.org/pil/4.2.html
This idea of local + single-statement names in the same expression
strikes me as similar. Having that same sort of thing happening within a
single statement gives me a headache:
    spam = (spam, ((spam + spam as spam) + spam as spam), spam)
Explain that, if you will.
>     # Compound statements usually enclose everything...
>     if (re.match(...) as m):
>         print(m.groups(0))
>     print(m) # NameError
Ah, how surprising -- given the tone of this PEP, I honestly thought
that it only applied to a single statement, not compound statements.
You should mention this much earlier.
>     # ... except when function bodies are involved...
>     if (input("> ") as cmd):
>         def run_cmd():
>             print("Running command", cmd) # NameError
Such a special case is a violation of the Principle of Least Surprise.
>     # ... but function *headers* are executed immediately
>     if (input("> ") as cmd):
>         def run_cmd(cmd=cmd): # Capture the value in the default arg
>             print("Running command", cmd) # Works
>
> Function bodies, in this respect, behave the same way they do in class scope;
> assigned names are not closed over by method definitions. Defining a function
> inside a loop already has potentially-confusing consequences, and SLNBs do not
> materially worsen the existing situation.
Except by adding more complications to make it even harder to
understand the scoping rules.
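For reference, the existing class-scope behaviour that the PEP compares this to can be sketched as follows. The class and method names are invented for the example, and it assumes there is no module-level name cmd:

```python
class Commands:
    cmd = "status"   # a name assigned in class scope

    def run_body(self):
        # The method body does not close over the class-level cmd; the
        # lookup goes to globals/builtins and fails if nothing is found.
        return "Running command " + cmd

    def run_default(self, cmd=cmd):
        # The default is evaluated while the class body runs, so it
        # captures the class-level value.
        return "Running command " + cmd
```

Here Commands().run_default() works, while Commands().run_body() raises NameError. The same split between header and body is what the quoted examples rely on.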
> Differences from regular assignment statements
> ----------------------------------------------
>
> Using ``(EXPR as NAME)`` is similar to ``NAME = EXPR``, but has a number of
> important distinctions.
>
> * Assignment is a statement; an SLNB is an expression whose value is the same
> as the object bound to the new name.
> * SLNBs disappear at the end of their enclosing statement, at which point the
> name again refers to whatever it previously would have. SLNBs can thus
> shadow other names without conflict (although deliberately doing so will
> often be a sign of bad code).
Why choose this design over binding to a local variable? What benefit is
there to using yet another scope?
> * SLNBs cannot be closed over by nested functions, and are completely ignored
> for this purpose.
What's the justification for this limitation?
> * SLNBs do not appear in ``locals()`` or ``globals()``.
That is like non-locals, so I suppose that's not unprecedented.
Will there be a function slnbs() to retrieve these?
> * An SLNB cannot be the target of any form of assignment, including augmented.
> Attempting to do so will remove the SLNB and assign to the fully-scoped name.
What's the justification for this limitation?
> Example usage
> =============
>
> These list comprehensions are all approximately equivalent::
[...]
I don't think you need to give an exhaustive list of every way to write
a list comp. List comps are only a single use-case for this feature.
>     # See, for instance, Lib/pydoc.py
>     if (re.search(pat, text) as match):
>         print("Found:", match.group(0))
I do not believe that is actually code found in Lib/pydoc.py, since that
will be a syntax error. What are you trying to say here?
>     while (sock.read() as data):
>         print("Received data:", data)
Looking at that example, I wonder why we need to include the parens when
there is no ambiguity.
    # okay
    while sock.read() as data:
        print("Received data:", data)

    # needs parentheses
    while (spam.method() as eggs) is None or eggs.count() < 100:
        print("something")
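For comparison, this is roughly how the first of those loops has to be spelled in current Python. It is only a sketch: io.BytesIO stands in for the socket, and read_chunks and the chunk size are names and values I have made up for the example:

```python
import io

def read_chunks(stream, size=4):
    """Collect chunks from stream until read() returns an empty result."""
    chunks = []
    while True:
        data = stream.read(size)   # the binding needs its own statement
        if not data:               # ... and the test needs another
            break
        chunks.append(data)
    return chunks

received = read_chunks(io.BytesIO(b"spam and eggs"))
```

The "while True ... break" shape is what the proposed "while sock.read() as data:" would collapse into a single line.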
> Performance costs
> =================
>
> The cost of SLNBs must be kept to a minimum, particularly when they are not
> used; the normal case MUST NOT be measurably penalized.
What is the "normal case"?
It takes time, even if only a nanosecond, to bind a value to a
name, as opposed to *not* binding it to a name.
    x = (spam as eggs)

has to be more expensive than

    x = spam

because the first performs two name bindings rather than one. So "MUST
NOT" already implies this proposal *must* be rejected. Perhaps you mean
that there SHOULD NOT be a SIGNIFICANT performance penalty.
> SLNBs are expected to be uncommon,
On what basis do you expect this?
Me, I'm cynical about my fellow coders, because I've worked with them
and read their code *wink* and I expect they'll use this everywhere
"just in case" and "to avoid namespace pollution".
But putting aside such (potential) abuse of the feature, I think you're
under-cutting your own proposal.
If this is really going to be uncommon, why bother complicating the
language with a whole extra scope that hardly anyone is going to use but
will be cryptic and mysterious on the rare occasion that they bump into
it? Especially using a keyword that is already used elsewhere: "import
as", "with as" and "except as" are going to dominate the search results.
If this really will be uncommon, it's not worth it, but I don't think it
would be uncommon. For good or ill, I think people will use this.
Besides, I think that the while loop example is a really nice one. I'd
use that, I think. I *almost* think that it alone justifies the
exercise.
> and using many of them in a single function should definitely
> be discouraged.
Do you mean a single statement? I don't see why using this many times
in a single function should be discouraged.
> Forbidden special cases
> =======================
>
> In two situations, the use of SLNBs makes no sense, and could be confusing due
> to the ``as`` keyword already having a different meaning in the same context.
I'm pretty sure there are many more than just two situations where the
use of this makes no sense. Many of your examples perform an unnecessary
name binding that is then never used. I think that's going to encourage
programmers to do the same, especially when they read this PEP and think
your examples are "Best Practice".
Besides, in principle they could be useful (at least in contrived
examples). Remember that exceptions are not necessarily constants. They
can be computed at runtime:
    try:
        ...
    except (Errors[key], spam(Errors[key])):
        ...
Since we have a DRY-violation in Errors[key] twice, it is conceivable
that we could write:
    try:
        ...
    except ((Errors[key] as my_error), spam(my_error)):
        ...
Contrived? Sure. But I think it makes sense.
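To show that computed exception classes already work, here is a runnable sketch in current Python. Errors, spam and handle are hypothetical, invented to match the shape of the example above, and the temporary name is bound with an ordinary assignment before the try:

```python
# Hypothetical table mapping keys to exception classes.
Errors = {"lookup": KeyError}

def spam(exc_class):
    """Hypothetical helper: return an exception class related to exc_class."""
    return IndexError if exc_class is KeyError else exc_class

def handle(key, action):
    """Run action(), catching the exception classes computed from key."""
    my_error = Errors[key]   # computed once, bound in the enclosing scope
    try:
        action()
    except (my_error, spam(my_error)):
        return "caught"
    return "ok"
```

One difference from the single-statement spelling: the expression in an except clause is evaluated only when an exception actually reaches it, whereas the assignment here runs every time, before the try is entered.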
Perhaps a better argument is that it may be ambiguous with existing
syntax, in which case the ambiguous cases should be banned.
> Alternative proposals
> =====================
>
> Proposals broadly similar to this one have come up frequently on python-ideas.
> Below are a number of alternative syntaxes, some of them specific to
> comprehensions, which have been rejected in favour of the one given above.
>
> 1. ``where``, ``let``, ``given``::
>
>        stuff = [(y, x/y) where y = f(x) for x in range(5)]
>        stuff = [(y, x/y) let y = f(x) for x in range(5)]
>        stuff = [(y, x/y) given y = f(x) for x in range(5)]
>
> This brings the subexpression to a location in between the 'for' loop and
> the expression. It introduces an additional language keyword, which creates
> conflicts. Of the three, ``where`` reads the most cleanly, but also has the
> greatest potential for conflict (eg SQLAlchemy and numpy have ``where``
> methods, as does ``tkinter.dnd.Icon`` in the standard library).
>
> 2. ``with NAME = EXPR``::
>
>        stuff = [(y, x/y) with y = f(x) for x in range(5)]
This is the same proposal as above, just using a different keyword.
> 3. ``with EXPR as NAME``::
>
>        stuff = [(y, x/y) with f(x) as y for x in range(5)]
Again, this isn't an alternative proposal, this is the same as 1. above
just with different syntax. Likewise for 4. and 5.
So you don't really have five different proposals, but only 1, with
slight variations of syntax or semantics. They should be grouped
together.
"We have five different lunches available. Spam, spam and spam, spam
deluxe, spam with eggs and spam, and chicken surprise."
"What's the chicken surprise?"
"It's actually made of spam."
> 6. Allowing ``(EXPR as NAME)`` to assign to any form of name.
And this would be a second proposal.
> This is exactly the same as the promoted proposal, save that the name is
> bound in the same scope that it would otherwise have. Any expression can
> assign to any name, just as it would if the ``=`` operator had been used.
> Such variables would leak out of the statement into the enclosing function,
> subject to the regular behaviour of comprehensions (since they implicitly
> create a nested function, the name binding would be restricted to the
> comprehension itself, just as with the names bound by ``for`` loops).
Indeed. Why are you rejecting this in favour of combining name-binding +
new scope into a single syntax?
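The "regular behaviour of comprehensions" mentioned in the quoted text can be checked in current Python. A minimal sketch, with a function name of my own choosing and assuming no module-level name n exists:

```python
def comprehension_leaks():
    """Return True if the comprehension's loop variable is visible afterwards."""
    squares = [n * n for n in range(3)]
    try:
        n   # not bound in this function; falls through to a global lookup
    except NameError:
        return False
    return True
```

In Python 3 this returns False: the loop variable stays inside the comprehension, which is the restriction that alternative 6 would inherit for names bound there.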
--
Steve