PEP 572: Statement-Local Name Bindings, take three!
Apologies for letting this languish; life has an annoying habit of
getting in the way now and then.
Feedback from the previous rounds has been incorporated. From here,
the most important concern and question is: Is there any other syntax
or related proposal that ought to be mentioned here? If this proposal
is rejected, it should be rejected with a full set of alternatives.
Text of PEP is below; formatted version will be live shortly (if it
isn't already) at:
https://www.python.org/dev/peps/pep-0572/
ChrisA
PEP: 572
Title: Syntax for Statement-Local Name Bindings
Author: Chris Angelico
On 23 March 2018 at 10:01, Chris Angelico
    # ... except when function bodies are involved...
    if (input("> ") as cmd):
        def run_cmd():
            print("Running command", cmd)  # NameError
    # ... but function *headers* are executed immediately
    if (input("> ") as cmd):
        def run_cmd(cmd=cmd):  # Capture the value in the default arg
            print("Running command", cmd)  # Works
What about

    cmd = "Something else"
    if (input("> ") as cmd):
        def run_cmd():
            print("Running command", cmd)  # Closes over the "outer" cmd, not the statement-local one?

Did I get that right? I don't really like it if so (I think it's confusing) but I guess I could live with "well, don't do that then" as an answer. And I don't have a better interpretation.

I'm still not convinced I like the proposal, but it's a lot cleaner than previous versions, so thanks for that. Far fewer places where I said "hmm, I don't understand the implications".

Paul
On 23/03/18 10:01, Chris Angelico wrote:
Apologies for letting this languish; life has an annoying habit of getting in the way now and then.
Feedback from the previous rounds has been incorporated. From here, the most important concern and question is: Is there any other syntax or related proposal that ought to be mentioned here? If this proposal is rejected, it should be rejected with a full set of alternatives.
Thank you very much, Chris. I think you've won me over on most points, though I'm not sure whether I'm overall +0 or -0 on the whole PEP :-) -- Rhodri James *-* Kynesim Ltd
On Fri, Mar 23, 2018 at 9:38 PM, Paul Moore
On 23 March 2018 at 10:01, Chris Angelico
wrote:

    # ... except when function bodies are involved...
    if (input("> ") as cmd):
        def run_cmd():
            print("Running command", cmd)  # NameError
    # ... but function *headers* are executed immediately
    if (input("> ") as cmd):
        def run_cmd(cmd=cmd):  # Capture the value in the default arg
            print("Running command", cmd)  # Works
What about
    cmd = "Something else"
    if (input("> ") as cmd):
        def run_cmd():
            print("Running command", cmd)  # Closes over the "outer" cmd, not the statement-local one?
Did I get that right? I don't really like it if so (I think it's confusing) but I guess I could live with "well, don't do that then" as an answer. And I don't have a better interpretation.
Yes, that would be it. And I agree: Don't do that. It's the same sort of confusion you'd get here:

    def f():
        spam = 1
        class C:
            spam = 2
            def g(x=spam):
                print(spam)  # prints 1
                print(x)     # prints 2
        C.g()

A class creates a scope that function bodies inside it don't close over, but their headers are still executed in that scope. So default argument values "see" those inner variables, but the body of the function doesn't. It's the same with SLNBs.
I'm still not convinced I like the proposal, but it's a lot cleaner than previous versions, so thanks for that. Far fewer places where I said "hmm, I don't understand the implications".
Cool, thanks. That's the idea here. ChrisA
On Fri, Mar 23, 2018 at 09:01:01PM +1100, Chris Angelico wrote:
PEP: 572 Title: Syntax for Statement-Local Name Bindings [...] Abstract ========
Programming is all about reusing code rather than duplicating it.
I don't think that editorial comment belongs here, or at least, it is way too strong. I'm pretty sure that programming is not ALL about reusing code, and code duplication is not always wrong.

Rather, we can say that *often* we want to avoid code duplication, and this proposal is one way to do so. And this should go into the Rationale, not the Abstract. The abstract should describe what this proposal *does*, not why, for example:

    This is a proposal for permitting temporary name bindings which are limited to a single statement.

What the proposal *is* goes in the Abstract; reasons *why* we want it go in the Rationale.

I see you haven't mentioned anything about Nick Coghlan's (long ago) concept of a "where" block. If memory serves, it would be something like:

    value = x**2 + 2*x where:
        x = some expression

These are not necessarily competing, but they are relevant.

Nor have you done a review of any other languages, to see what similar features they already offer. Not even C's form of "assignment as an expression" -- you should refer to that, and explain why this would not similarly be a bug magnet.
Rationale =========
When a subexpression is used multiple times in a list comprehension,
I think that list comps are merely a single concrete example of a more general concept: that we sometimes want or need to apply the DRY principle within a single expression. This is (usually) a violation of DRY whether it is inside or outside of a list comp:

    result = (func(x), func(x)+1, func(x)*2)
Syntax and semantics ====================
In any context where arbitrary Python expressions can be used, a **named expression** can appear. This must be parenthesized for clarity, and is of the form ``(expr as NAME)`` where ``expr`` is any valid Python expression, and ``NAME`` is a simple name.
The value of such a named expression is the same as the incorporated expression, with the additional side-effect that NAME is bound to that value for the remainder of the current statement.
Examples should go with the description. Such as:

    x = None if (spam().ham as eggs) is None else eggs
    y = ((spam() as eggs), (eggs.method() as cheese), cheese[eggs])
Just as function-local names shadow global names for the scope of the function, statement-local names shadow other names for that statement. (They can technically also shadow each other, though actually doing this should not be encouraged.)
That seems weird.
Assignment to statement-local names is ONLY through this syntax. Regular assignment to the same name will remove the statement-local name and affect the name in the surrounding scope (function, class, or module).
That seems unnecessary. Since the scope only applies to a single statement, not a block, there can be no other assignment to that name.

Correction: I see further on that this isn't the case. But that's deeply confusing, to have the same name refer to two (or more!) scopes in the same block. I think that's going to lead to some really confusing scoping problems.
Statement-local names never appear in locals() or globals(), and cannot be closed over by nested functions.
Why can they not be used in closures? I expect that's going to cause a lot of frustration.
Execution order and its consequences ------------------------------------
Since the statement-local name binding lasts from its point of execution to the end of the current statement, this can potentially cause confusion when the actual order of execution does not match the programmer's expectations. Some examples::
    # A simple statement ends at the newline or semicolon.
    a = (1 as y)
    print(y)  # NameError
That error surprises me. Every other use of "as" binds to the current local namespace. (Or global, if you use the global declaration first.) I think there's going to be a lot of confusion about which uses of "as" bind to a new local and which don't.

I think this proposal is conflating two unrelated concepts:

- introducing new variables in order to meet DRY requirements;
- introducing a new scope.

Why can't we do the first without the second?

    a = (1 as y)
    print(y)  # prints 1, as other uses of "as" would do

That would avoid the unnecessary (IMO) restriction that these variables cannot be used in closures.
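The observation that existing uses of "as" bind an ordinary name in the enclosing scope is easy to check in current Python. A minimal sketch (the names `j`, `f`, and `text` are illustrative):

```python
# "as" in an import statement binds a regular module-level name
import json as j
assert j.loads("1") == 1

# "as" in a with statement likewise binds a name that outlives the block
import io
with io.StringIO("hello") as f:
    text = f.read()

assert text == "hello"
assert f.closed  # 'f' is still bound after the with-statement ends
```

(The one partial exception is `except ... as e`, which implicitly deletes its target when the handler exits, though it still binds into the normal local scope while the handler runs.)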
    # The assignment ignores the SLNB - this adds one to 'a'
    a = (a + 1 as a)
"SLNB"? Undefined acronym. What is it? I presume it has something to do with the single-statement variable.

I know it would be legal, but why would you write something like that? Surely your examples must at least have a pretence of being useful (even if the examples are only toy examples rather than realistic).

I think that having "a" be both local and single-statement in the same expression is an awful idea. Lua has the (mis-)features that variables are global by default, locals need to be declared, and the same variable name can refer to both local and global simultaneously. Thus we have:

    print(x)        -- prints the global x
    local x = x + 1 -- sets local x to the global x plus 1
    print(x)        -- prints the local x

https://www.lua.org/pil/4.2.html

This idea of local + single-statement names in the same expression strikes me as similar. Having that same sort of thing happening within a single statement gives me a headache:

    spam = (spam, ((spam + spam as spam) + spam as spam), spam)

Explain that, if you will.
    # Compound statements usually enclose everything...
    if (re.match(...) as m):
        print(m.groups(0))
    print(m)  # NameError
Ah, how surprising -- given the tone of this PEP, I honestly thought that it only applied to a single statement, not compound statements. You should mention this much earlier.
    # ... except when function bodies are involved...
    if (input("> ") as cmd):
        def run_cmd():
            print("Running command", cmd)  # NameError
Such a special case is a violation of the Principle of Least Surprise.
    # ... but function *headers* are executed immediately
    if (input("> ") as cmd):
        def run_cmd(cmd=cmd):  # Capture the value in the default arg
            print("Running command", cmd)  # Works
Function bodies, in this respect, behave the same way they do in class scope; assigned names are not closed over by method definitions. Defining a function inside a loop already has potentially-confusing consequences, and SLNBs do not materially worsen the existing situation.
Except by adding more complications to make it even harder to understand the scoping rules.
Differences from regular assignment statements ----------------------------------------------
Using ``(EXPR as NAME)`` is similar to ``NAME = EXPR``, but has a number of important distinctions.
* Assignment is a statement; an SLNB is an expression whose value is the same as the object bound to the new name. * SLNBs disappear at the end of their enclosing statement, at which point the name again refers to whatever it previously would have. SLNBs can thus shadow other names without conflict (although deliberately doing so will often be a sign of bad code).
Why choose this design over binding to a local variable? What benefit is there to using yet another scope?
* SLNBs cannot be closed over by nested functions, and are completely ignored for this purpose.
What's the justification for this limitation?
* SLNBs do not appear in ``locals()`` or ``globals()``.
That is like non-locals, so I suppose that's not unprecedented. Will there be a function slnbs() to retrieve these?
* An SLNB cannot be the target of any form of assignment, including augmented. Attempting to do so will remove the SLNB and assign to the fully-scoped name.
What's the justification for this limitation?
Example usage =============
These list comprehensions are all approximately equivalent:: [...]
I don't think you need to give an exhaustive list of every way to write a list comp. List comps are only a single use-case for this feature.
    # See, for instance, Lib/pydoc.py
    if (re.search(pat, text) as match):
        print("Found:", match.group(0))
I do not believe that is actually code found in Lib/pydoc.py, since that will be a syntax error. What are you trying to say here?
    while (sock.read() as data):
        print("Received data:", data)
Looking at that example, I wonder why we need to include the parens when there is no ambiguity.

    # okay
    while sock.read() as data:
        print("Received data:", data)

    # needs parentheses
    while (spam.method() as eggs) is None or eggs.count() < 100:
        print("something")
Performance costs =================
The cost of SLNBs must be kept to a minimum, particularly when they are not used; the normal case MUST NOT be measurably penalized.
What is the "normal case"? It takes time, even if only a nanosecond, to bind a value to a name, as opposed to *not* binding it to a name.

    x = (spam as eggs)

has to be more expensive than

    x = spam

because the first performs two name bindings rather than one. So "MUST NOT" already implies this proposal *must* be rejected. Perhaps you mean that there SHOULD NOT be a SIGNIFICANT performance penalty.
SLNBs are expected to be uncommon,
On what basis do you expect this? Me, I'm cynical about my fellow coders, because I've worked with them and read their code *wink* and I expect they'll use this everywhere "just in case" and "to avoid namespace pollution".

But putting aside such (potential) abuse of the feature, I think you're under-cutting your own proposal. If this is really going to be uncommon, why bother complicating the language with a whole extra scope that hardly anyone is going to use, but which will be cryptic and mysterious on the rare occasions that they bump into it? Especially using a keyword that is already used elsewhere: "import as", "with as" and "except as" are going to dominate the search results.

If this really will be uncommon, it's not worth it; but I don't think it would be uncommon. For good or ill, I think people will use this.

Besides, I think that the while loop example is a really nice one. I'd use that, I think. I *almost* think that it alone justifies the exercise.
and using many of them in a single function should definitely be discouraged.
Do you mean a single statement? I don't see why it should be discouraged from using this many times in a single function.
Forbidden special cases =======================
In two situations, the use of SLNBs makes no sense, and could be confusing due to the ``as`` keyword already having a different meaning in the same context.
I'm pretty sure there are many more than just two situations where the use of this makes no sense. Many of your examples perform an unnecessary name binding that is then never used. I think that's going to encourage programmers to do the same, especially when they read this PEP and think your examples are "Best Practice".

Besides, in principle they could be useful (at least in contrived examples). Remember that exceptions are not necessarily constants. They can be computed at runtime:

    try:
        ...
    except (Errors[key], spam(Errors[key])):
        ...

Since we have a DRY-violation in Errors[key] twice, it is conceivable that we could write:

    try:
        ...
    except ((Errors[key] as my_error), spam(my_error)):
        ...

Contrived? Sure. But I think it makes sense. Perhaps a better argument is that it may be ambiguous with existing syntax, in which case the ambiguous cases should be banned.
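The claim that exception specifications can be computed at runtime is verifiable in current Python. A minimal sketch (`Errors` and `key` are invented names, echoing the example above):

```python
Errors = {"io": OSError, "value": ValueError}
key = "value"

try:
    raise ValueError("bad input")
except Errors[key] as exc:  # the exception spec is an expression, evaluated at runtime
    caught = str(exc)

assert caught == "bad input"
```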
Alternative proposals =====================
Proposals broadly similar to this one have come up frequently on python-ideas. Below are a number of alternative syntaxes, some of them specific to comprehensions, which have been rejected in favour of the one given above.
1. ``where``, ``let``, ``given``::
    stuff = [(y, x/y) where y = f(x) for x in range(5)]
    stuff = [(y, x/y) let y = f(x) for x in range(5)]
    stuff = [(y, x/y) given y = f(x) for x in range(5)]
This brings the subexpression to a location in between the 'for' loop and the expression. It introduces an additional language keyword, which creates conflicts. Of the three, ``where`` reads the most cleanly, but also has the greatest potential for conflict (eg SQLAlchemy and numpy have ``where`` methods, as does ``tkinter.dnd.Icon`` in the standard library).
2. ``with NAME = EXPR``::
stuff = [(y, x/y) with y = f(x) for x in range(5)]
This is the same proposal as above, just using a different keyword.
3. ``with EXPR as NAME``::
stuff = [(y, x/y) with f(x) as y for x in range(5)]
Again, this isn't an alternative proposal, this is the same as 1. above just with different syntax. Likewise for 4. and 5. So you don't really have five different proposals, but only 1, with slight variations of syntax or semantics. They should be grouped together. "We have five different lunches available. Spam, spam and spam, spam deluxe, spam with eggs and spam, and chicken surprise." "What's the chicken surprise?" "It's actually made of spam."
6. Allowing ``(EXPR as NAME)`` to assign to any form of name.
And this would be a second proposal.
This is exactly the same as the promoted proposal, save that the name is bound in the same scope that it would otherwise have. Any expression can assign to any name, just as it would if the ``=`` operator had been used. Such variables would leak out of the statement into the enclosing function, subject to the regular behaviour of comprehensions (since they implicitly create a nested function, the name binding would be restricted to the comprehension itself, just as with the names bound by ``for`` loops).
Indeed. Why are you rejecting this in favour of combining name-binding + new scope into a single syntax? -- Steve
On Sat, Mar 24, 2018 at 2:00 AM, Steven D'Aprano
On Fri, Mar 23, 2018 at 09:01:01PM +1100, Chris Angelico wrote:
PEP: 572 Title: Syntax for Statement-Local Name Bindings [...] Abstract ========
Programming is all about reusing code rather than duplicating it.
I don't think that editorial comment belongs here, or at least, it is way too strong. I'm pretty sure that programming is not ALL about reusing code, and code duplication is not always wrong.
Rather, we can say that *often* we want to avoid code duplication, and this proposal is one way to do so. And this should go into the Rationale, not the Abstract. The abstract should describe what this proposal *does*, not why, for example:
This is a proposal for permitting temporary name bindings which are limited to a single statement.
What the proposal *is* goes in the Abstract; reasons *why* we want it go in the Rationale.
Thanks. I've never really been happy with my "Abstract" / "Rationale" split, as they're two sections both designed to give that initial 'sell', and I'm clearly not good at writing the distinction :) Unless you object, I'm just going to steal your Abstract wholesale. Seems like some good words there.
I see you haven't mentioned anything about Nick Coghlan's (long ago) concept of a "where" block. If memory serves, it would be something like:
    value = x**2 + 2*x where:
        x = some expression
These are not necessarily competing, but they are relevant.
Definitely relevant, thanks. This is exactly what I'm looking for - related proposals that got lost in the lengthy threads on the subject. I'll mention it as another proposal, but if anyone has an actual post for me to reference, that would be appreciated (just to make sure I'm correctly representing it).
Nor have you done a review of any other languages, to see what similar features they already offer. Not even the C's form of "assignment as an expression" -- you should refer to that, and explain why this would not similarly be a bug magnet.
No, I haven't yet. Sounds like a new section is needed. Thing is, there's a HUGE family of C-like and C-inspired languages that allow assignment expressions, and for the rest, I don't have any personal experience. So I need input from people: what languages do you know of that have small-scope name bindings like this?
Rationale =========
When a subexpression is used multiple times in a list comprehension,
I think that list comps are merely a single concrete example of a more general concept: that we sometimes want or need to apply the DRY principle within a single expression.
This is (usually) a violation of DRY whether it is inside or outside of a list comp:
result = (func(x), func(x)+1, func(x)*2)
True, but outside of comprehensions, the most obvious response is "just add another assignment statement". You can't do that in a list comp (or equivalently in a genexp or dict comp). Syntactically you're right that they're just one example of a general concept; but they're one of the original motivating reasons. I've tweaked the rationale wording some; the idea is now "here's a general idea" followed by two paragraphs of specific use-cases (comprehensions and loops). Let me know if that works better.
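For completeness, current Python does offer a (clunky) way to bind a name inside a comprehension, by routing the value through a single-element inner loop. A sketch of that existing idiom (`f` is a stand-in for the repeated subexpression):

```python
def f(x):
    return x * x

# DRY violation: f(x) is computed twice per element
stuff = [(f(x), x / f(x)) for x in range(1, 5)]

# Workaround: a one-element 'for' clause binds y, computing f(x) once
stuff2 = [(y, x / y) for x in range(1, 5) for y in [f(x)]]

assert stuff == stuff2
```

The workaround produces the same results, but at the cost of readability, which is part of what motivates the PEP.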
Syntax and semantics ====================
In any context where arbitrary Python expressions can be used, a **named expression** can appear. This must be parenthesized for clarity, and is of the form ``(expr as NAME)`` where ``expr`` is any valid Python expression, and ``NAME`` is a simple name.
The value of such a named expression is the same as the incorporated expression, with the additional side-effect that NAME is bound to that value for the remainder of the current statement.
Examples should go with the description. Such as:
x = None if (spam().ham as eggs) is None else eggs
Not sure what you gain out of that :) Maybe a different first expression would help.
y = ((spam() as eggs), (eggs.method() as cheese), cheese[eggs])
Sure. I may need to get some simpler examples to kick things off though.
Just as function-local names shadow global names for the scope of the function, statement-local names shadow other names for that statement. (They can technically also shadow each other, though actually doing this should not be encouraged.)
That seems weird.
Which part? That they shadow, or that they can shadow each other? Shadowing is the same as nested functions (including comprehensions, since they're implemented with functions); and if SLNBs are *not* to shadow each other, the only way is to straight-up disallow it. For the moment, I'm not forbidding it, as there's no particular advantage to popping a SyntaxError.
Assignment to statement-local names is ONLY through this syntax. Regular assignment to the same name will remove the statement-local name and affect the name in the surrounding scope (function, class, or module).
That seems unnecessary. Since the scope only applies to a single statement, not a block, there can be no other assignment to that name.
Correction: I see further in that this isn't the case. But that's deeply confusing, to have the same name refer to two (or more!) scopes in the same block. I think that's going to lead to some really confusing scoping problems.
    >>> def f():
    ...     e = 2.71828
    ...     try:
    ...         1/0
    ...     except Exception as e:
    ...         print(e)
    ...     print(e)
    ...
    >>> f()
    division by zero
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<stdin>", line 7, in f
    UnboundLocalError: local variable 'e' referenced before assignment

For the current proposal, I prefer simpler definitions to outlawing the odd options. The rule is: an SLNB exists from the moment it's created to the end of that statement. Very simple, very straight-forward. Yes, that means you could use the same name earlier in the statement, but ideally, you just wouldn't do that. Python already has weirder behaviour in it, as the session above shows. Does this often cause problems? No, because most functions don't use the same name in two different ways. An SLNB should be basically the same.
Statement-local names never appear in locals() or globals(), and cannot be closed over by nested functions.
Why can they not be used in closures? I expect that's going to cause a lot of frustration.
Conceptually, the variable stops existing at the end of that statement. It makes for some oddities, but fewer oddities than every other variant that I toyed with. For example, does this create one single temporary or many different temporaries?

    def f():
        x = "outer"
        funcs = {}
        for i in range(10):
            if (g(i) as x) > 0:
                def closure():
                    return x
                funcs[x] = closure

Obviously the 'x' in funcs[x] is the current version of x as it runs through the loop. But what about the one closed over? If regular assignment is used ("x = g(i)"), the last value of x will be seen by every function. With a statement-local variable, should it be a single temporary all through the loop, or should each iteration create a brand new "slot" that gets closed over? If the latter, why is it different from regular assignment, and how would it be implemented anyway? Do we now need an infinite number of closure cells that all have the exact same name?
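The closure-cell question has a current-Python analogue: functions defined in a loop all share the single binding of the loop variable, unless the value is captured with a default argument. A small demonstration:

```python
late_bound = []
early_bound = []
for i in range(3):
    # closes over the one binding of i; all three see its final value
    late_bound.append(lambda: i)
    # default argument captures the value of i at definition time
    early_bound.append(lambda j=i: j)

assert [f() for f in late_bound] == [2, 2, 2]
assert [f() for f in early_bound] == [0, 1, 2]
```

Any SLNB design has to pick one of these two behaviours (or forbid closing over the name entirely, as the PEP does).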
Execution order and its consequences ------------------------------------
Since the statement-local name binding lasts from its point of execution to the end of the current statement, this can potentially cause confusion when the actual order of execution does not match the programmer's expectations. Some examples::
    # A simple statement ends at the newline or semicolon.
    a = (1 as y)
    print(y)  # NameError
That error surprises me. Every other use of "as" binds to the current local namespace. (Or global, if you use the global declaration first.)
I think there's going to be a lot of confusion about which uses of "as" bind to a new local and which don't.
That's the exact point of "statement-local" though.
I think this proposal is conflating two unrelated concepts:
- introducing new variables in order to meet DRY requirements;
- introducing a new scope.
Why can't we do the first without the second?
    a = (1 as y)
    print(y)  # prints 1, as other uses of "as" would do
That would avoid the unnecessary (IMO) restriction that these variables cannot be used in closures.
You're talking about one of the alternate proposals there. (#6, currently.) I have talked about the possibility of splitting this into two separate proposals, but then I'd have to try to chair two separate concurrent discussions that would constantly interact and cross over :)
    # The assignment ignores the SLNB - this adds one to 'a'
    a = (a + 1 as a)
"SLNB"? Undefined acronym. What is it? I presume it has something to do with the single-statement variable.
Statement-Local Name Binding, from the title of the PEP. (But people probably don't read titles.)
I know it would be legal, but why would you write something like that? Surely your examples must at least have a pretence of being useful (even if the examples are only toy examples rather than realistic).
That section is about the edge cases, and one such edge case is assigning through an SLNB.
I think that having "a" be both local and single-statement in the same expression is an awful idea. Lua has the (mis-)features that variables are global by default, locals need to be declared, and the same variable name can refer to both local and global simultaneously. Thus we have:
    print(x)        -- prints the global x
    local x = x + 1 -- sets local x to the global x plus 1
    print(x)        -- prints the local x
IMO that's a *good* thing. JavaScript works the other way; either you say "var x = x + 1;" and the variable exists for the whole function, pre-initialized to the special value 'undefined', or you say "let x = x + 1;" and the variable is in limbo until you hit that statement, causing a ReferenceError (JS's version of NameError). Neither makes as much sense as evaluating the initializer before the variable starts to exist. That said, though, this is STILL an edge case. It's giving a somewhat-sane meaning to something you normally won't do.
This idea of local + single-statement names in the same expression strikes me as similar. Having that same sort of thing happening within a single statement gives me a headache:
spam = (spam, ((spam + spam as spam) + spam as spam), spam)
Explain that, if you will.
Sure. First, eliminate all the name bindings:

    spam = (spam, ((spam + spam) + spam), spam)

Okay. Now anyone with basic understanding of algebra can figure out the execution order. Then every time you have a construct with an 'as', you change the value of 'spam' from that point on. Which means we have:

    spam0 = (spam0, ((spam0 + spam0 as spam1) + spam1 as spam2), spam2)

Execution order is strictly left-to-right here, so it's pretty straight-forward. Less clear if you have an if/else expression (since they're executed middle-out instead of left-to-right), but SLNBs are just like any other side effects in an expression, performed in a well-defined order. And just like with side effects, you don't want to have complex interactions between them, but there's nothing illegal in it.
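That left-to-right expansion can be de-sugared into plain assignments in current Python (the `_spam*` temporaries are invented purely for illustration):

```python
spam = 1

# spam = (spam, ((spam + spam as spam) + spam as spam), spam)
# evaluated strictly left to right becomes:
_spam0 = spam                    # the original value of spam
_spam1 = _spam0 + _spam0         # (spam + spam as spam)
_spam2 = _spam1 + _spam1         # (... + spam as spam)
spam = (_spam0, _spam2, _spam2)  # middle element, then trailing spam

assert spam == (1, 4, 4)
```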
    # Compound statements usually enclose everything...
    if (re.match(...) as m):
        print(m.groups(0))
    print(m)  # NameError
Ah, how surprising -- given the tone of this PEP, I honestly thought that it only applied to a single statement, not compound statements.
You should mention this much earlier.
Hmm. It's right up in the Rationale section, but without an example. Maybe an example would make it clearer?
    # ... except when function bodies are involved...
    if (input("> ") as cmd):
        def run_cmd():
            print("Running command", cmd)  # NameError
Such a special case is a violation of the Principle of Least Surprise.
Blame classes, which already behave exactly this way. Being able to close over temporaries creates its own problems.
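The class-scope behaviour referred to here can be demonstrated in current Python: names assigned in a class body are not visible to functions defined in that body, whose free variables skip straight to the enclosing or global scope.

```python
x = "global"

class C:
    x = "class attribute"

    def m(self):
        # the class-body x is NOT in scope here; lookup goes to the global x
        return x

assert C.x == "class attribute"
assert C().m() == "global"
```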
    # ... but function *headers* are executed immediately
    if (input("> ") as cmd):
        def run_cmd(cmd=cmd):  # Capture the value in the default arg
            print("Running command", cmd)  # Works
Function bodies, in this respect, behave the same way they do in class scope; assigned names are not closed over by method definitions. Defining a function inside a loop already has potentially-confusing consequences, and SLNBs do not materially worsen the existing situation.
Except by adding more complications to make it even harder to understand the scoping rules.
Except that I'm adding no complications. This is just the consequences of Python's *existing* scoping rules.
Differences from regular assignment statements ----------------------------------------------
Using ``(EXPR as NAME)`` is similar to ``NAME = EXPR``, but has a number of important distinctions.
* Assignment is a statement; an SLNB is an expression whose value is the same as the object bound to the new name. * SLNBs disappear at the end of their enclosing statement, at which point the name again refers to whatever it previously would have. SLNBs can thus shadow other names without conflict (although deliberately doing so will often be a sign of bad code).
Why choose this design over binding to a local variable? What benefit is there to using yet another scope?
Mainly, I just know that there has been a lot of backlash against a generic "assignment as expression" syntax in the past.
* SLNBs do not appear in ``locals()`` or ``globals()``.
That is like non-locals, so I suppose that's not unprecedented.
Will there be a function slnbs() to retrieve these?
Not in the current proposal, no. Originally, I planned for them to appear in locals() while they were in scope, but that created its own problems; I'd be happy to return to that proposal if it were worthwhile.
* An SLNB cannot be the target of any form of assignment, including augmented. Attempting to do so will remove the SLNB and assign to the fully-scoped name.
What's the justification for this limitation?
Not having that limitation creates worse problems, like that having "(1 as a)" somewhere can suddenly make an assignment fail. This is particularly notable with loop headers rather than simple statements.
Example usage =============
These list comprehensions are all approximately equivalent:: [...]
I don't think you need to give an exhaustive list of every way to write a list comp. List comps are only a single use-case for this feature.
    # See, for instance, Lib/pydoc.py
    if (re.search(pat, text) as match):
        print("Found:", match.group(0))
I do not believe that is actually code found in Lib/pydoc.py, since that will be a syntax error. What are you trying to say here?
Lib/pydoc.py has a more complicated version of the exact same functionality. This would be a simplification of a common idiom that can be found in the stdlib and elsewhere.
    while (sock.read() as data):
        print("Received data:", data)
Looking at that example, I wonder why we need to include the parens when there is no ambiguity.
    # okay
    while sock.read() as data:
        print("Received data:", data)
    # needs parentheses
    while (spam.method() as eggs) is None or eggs.count() < 100:
        print("something")
I agree, but starting with them mandatory allows for future relaxation of requirements. The changes to the grammar are less intrusive if the parens are always required (for instance, the special case "f(x for x in y)" has its own entry in the grammar).
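For comparison, the loop-and-a-half that the ``while ... as`` form would replace today, sketched with an in-memory stream standing in for the socket (an assumption, since ``sock`` is not defined in the example):

```python
import io

# Stand-in for the socket from the example above
sock = io.BytesIO(b"abcdef")

received = []
while True:                 # today's "loop and a half"
    data = sock.read(2)
    if not data:            # empty read signals end of stream
        break
    received.append(data)

print(received)
```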
Performance costs
=================
The cost of SLNBs must be kept to a minimum, particularly when they are not used; the normal case MUST NOT be measurably penalized.
What is the "normal case"?
The case where you're not using any SLNBs.
It takes time, even if only a nanosecond, to bind a value to a name, as opposed to *not* binding it to a name.
x = (spam as eggs)
has to be more expensive than
x = spam
because the first performs two name bindings rather than one. So "MUST NOT" already implies this proposal *must* be rejected. Perhaps you mean that there SHOULD NOT be a SIGNIFICANT performance penalty.
The mere fact that this feature exists in the language MUST NOT measurably impact Python run-time performance.
SLNBs are expected to be uncommon,
On what basis do you expect this?
Me, I'm cynical about my fellow coders, because I've worked with them and read their code *wink* and I expect they'll use this everywhere "just in case" and "to avoid namespace pollution".
Compared to regular name bindings? Just look at the number of ways to assign that are NOT statement-local, and then add in the fact that SLNBs aren't going to be effective for anything that you need to mutate more than once, and I fully expect that regular name bindings will far exceed SLNBs.
Besides, I think that the while loop example is a really nice one. I'd use that, I think. I *almost* think that it alone justifies the exercise.
Hmm, okay. I'll work on rewording that section later.
Forbidden special cases
=======================
In two situations, the use of SLNBs makes no sense, and could be confusing due to the ``as`` keyword already having a different meaning in the same context.
I'm pretty sure there are many more than just two situations where the use of this makes no sense. Many of your examples perform an unnecessary name binding that is then never used. I think that's going to encourage programmers to do the same, especially when they read this PEP and think your examples are "Best Practice".
Unnecessary, yes, but not downright problematic. The two specific cases mentioned are (a) evaluating expressions, and (b) using the 'as' keyword in a way that's incompatible with PEP 572. (There's no confusion in "import x as y", for instance, because "x" is not an expression.)
Besides, in principle they could be useful (at least in contrived examples). Remember that exceptions are not necessarily constants. They can be computed at runtime:
    try:
        ...
    except (Errors[key], spam(Errors[key])):
        ...
Sure they *can*. Have you ever seen something like that in production? I've seen simple examples (eg having a tuple of exception types that you care about, and that tuple not always being constant), but nothing where you could ever want an SLNB.
Since we have a DRY-violation in Errors[key] twice, it is conceivable that we could write:
    try:
        ...
    except ((Errors[key] as my_error), spam(my_error)):
        ...
Contrived? Sure. But I think it makes sense.
Perhaps a better argument is that it may be ambiguous with existing syntax, in which case the ambiguous cases should be banned.
It's not *technically* ambiguous, because PEP 572 demands parentheses and both 'except' and 'with' statements forbid parentheses. The compiler can, with 100% accuracy, pick between the two alternatives. But having "except X as Y:" mean something drastically different from "except (X as Y):" is confusing *to humans*.
2. ``with NAME = EXPR``::
    stuff = [(y, x/y) with y = f(x) for x in range(5)]
This is the same proposal as above, just using a different keyword.
Yep. I've changed the heading to "Alternative proposals and variants" as some of them are merely variations on each other. They're given separate entries because I have separate commentary about them.
6. Allowing ``(EXPR as NAME)`` to assign to any form of name.
And this would be a second proposal.
This is exactly the same as the promoted proposal, save that the name is bound in the same scope that it would otherwise have. Any expression can assign to any name, just as it would if the ``=`` operator had been used. Such variables would leak out of the statement into the enclosing function, subject to the regular behaviour of comprehensions (since they implicitly create a nested function, the name binding would be restricted to the comprehension itself, just as with the names bound by ``for`` loops).
Indeed. Why are you rejecting this in favour of combining name-binding + new scope into a single syntax?
Mainly because there's been a lot of backlash against regular assignment inside expressions. One thing I *have* learned from life is that you can't make everyone happy. Sometimes, "why isn't your proposal X instead of Y" is just "well, X is a valid proposal too, so you can go ahead and push for that one if you like". :) I had to pick something, and I picked that one. ChrisA
To keep this a manageable length, I've trimmed vigorously. Apologies in advance if I've been too enthusiastic with the trimming :-)

On Sat, Mar 24, 2018 at 05:09:54AM +1100, Chris Angelico wrote:
No, I haven't yet. Sounds like a new section is needed. Thing is, there's a HUGE family of C-like and C-inspired languages that allow assignment expressions, and for the rest, I don't have any personal experience. So I need input from people: what languages do you know of that have small-scope name bindings like this?
I don't know if this counts as "like this", but Lua has a do...end block that introduces a new scope. Something like this:

    x = 1
    do
        x = 2
        print(x)  -- prints 2
    end
    print(x)  -- prints 1

I think that's a neat concept, but I'm struggling to think what I would use it for. [...]
result = (func(x), func(x)+1, func(x)*2)
True, but outside of comprehensions, the most obvious response is "just add another assignment statement". You can't do that in a list comp (or equivalently in a genexp or dict comp).
Yes you can: your PEP gives equivalents that work fine for list comps, starting with factorising the duplicate code out into a helper function, to using a separate loop to get assignment:

    [(spam, spam+1) for x in values for spam in (func(x),)]

    [(spam, spam+1) for spam in (func(x) for x in values)]

They are the equivalent to "just add another assignment statement" for comprehensions. I acknowledge that comprehensions are the motivating example here, but I don't think they're the only justification for the concept.

Strictly speaking, there's never a time that we cannot use a new assignment statement. But sometimes it is annoying or inconvenient. Consider a contrived example:

    TABLE = [
        alpha, beta, gamma, delta, ...
        func(omega) + func(omega)**2 + func(omega)**3,
    ]

Yes, I can pull out the duplication:

    temp = func(omega)
    TABLE = [
        alpha, beta, gamma, delta, ...
        temp + temp**2 + temp**3,
    ]

but that puts the definition of temp quite distant from its use. So this is arguably nicer:

    TABLE = [
        alpha, beta, gamma, delta, ...
        (func(omega) as temp) + temp**2 + temp**3,
    ]
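The two comprehension rewrites quoted above can be checked in current Python; ``func`` and ``values`` below are hypothetical stand-ins:

```python
def func(x):
    # Hypothetical stand-in for the func() in the examples above
    return x * 10

values = range(3)

# Single-element inner loop as a poor man's assignment:
a = [(spam, spam + 1) for x in values for spam in (func(x),)]

# Nested generator expression doing the binding instead:
b = [(spam, spam + 1) for spam in (func(x) for x in values)]

assert a == b == [(0, 1), (10, 11), (20, 21)]
```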
Just as function-local names shadow global names for the scope of the function, statement-local names shadow other names for that statement. (They can technically also shadow each other, though actually doing this should not be encouraged.)
That seems weird.
Which part? That they shadow, or that they can shadow each other?
Shadowing themselves.

I'm still not convinced these should just shadow local variables. Of course locals will shadow nonlocals, which shadow globals, which shadow builtins. I'm just not sure that we gain much (enough?) to justify adding a new scope between what we already have:

    proposed statement-local
    local
    nonlocal
    class (only during class statement)
    global
    builtins

I think that needs justification by more than just "it makes the implementation easier".
Shadowing is the same as nested functions (including comprehensions, since they're implemented with functions); and if SLNBs are *not* to shadow each other, the only way is to straight-up disallow it.
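The shadowing behaviour being appealed to here can be seen in current Python, since nested functions (and, in Python 3, comprehensions) already shadow outer names without conflict:

```python
x = "outer"

def inner():
    x = "inner"       # shadows the module-level x; never touches it
    return x

assert inner() == "inner"
assert x == "outer"

# Python 3 comprehensions run in a hidden function scope, so their
# iteration variable shadows rather than overwrites:
result = [x for x in ("shadowed",)]
assert x == "outer"
```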
Or they can just rebind to the same (statement-)local. E.g.:

    while ("spam" as x):
        assert x == "spam"
        while ("eggs" as x):
            assert x == "eggs"
            break
        assert x == "eggs"
Why can they not be used in closures? I expect that's going to cause a lot of frustration.
Conceptually, the variable stops existing at the end of that statement. It makes for some oddities, but fewer oddities than every other variant that I toyed with. For example, does this create one single temporary or many different temporaries?
    def f():
        x = "outer"
        funcs = {}
        for i in range(10):
            if (g(i) as x) > 0:
                def closure():
                    return x
                funcs[x] = closure
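The "one temporary or many" question mirrors an ambiguity that current Python closures already have: they capture variables, not values, so the answer today depends on how the binding is made. A minimal demonstration:

```python
# All three lambdas close over the single comprehension variable `i`,
# so after the loop finishes they all see its final value:
late = [lambda: i for i in range(3)]
assert [f() for f in late] == [2, 2, 2]

# A default argument freezes the value at definition time instead:
early = [lambda i=i: i for i in range(3)]
assert [f() for f in early] == [0, 1, 2]
```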
I think the rule should be either: - statement-locals actually *are* locals and so behave like locals; - statement-locals introduce a new scope, but still behave like locals with respect to closures. No need to introduce two separate modes of behaviour. (Or if there is such a need, then the PEP should explain it.)
I think there's going to be a lot of confusion about which uses of "as" bind to a new local and which don't.
That's the exact point of "statement-local" though.
I don't think so. As I say:
I think this proposal is conflating two unrelated concepts:
- introducing new variables in order to meet DRY requirements;
- introducing a new scope.
If you're going to champion *both* concepts, then you need to justify them both in the PEP, not just assume its obvious why we want both together.
"SLNB"? Undefined acronym. What is it? I presume it has something to do with the single-statement variable.
Statement-Local Name Binding, from the title of the PEP. (But people probably don't read titles.)
Indeed. In case it isn't obvious, you should define the acronym the first time you use it in the PEP.
This idea of local + single-statement names in the same expression strikes me as similar. Having that same sort of thing happening within a single statement gives me a headache:
spam = (spam, ((spam + spam as spam) + spam as spam), spam)
Explain that, if you will.
Sure. First, eliminate all the name bindings: [...]
The point is not that it cannot be explained, but that it requires careful thought to understand. An advantage of using just regular locals is that we don't have to think about the consequences of introducing two new scopes. Its all happening to the same "a" variable.
Ah, how surprising -- given the tone of this PEP, I honestly thought that it only applied to a single statement, not compound statements.
You should mention this much earlier.
Hmm. It's right up in the Rationale section, but without an example. Maybe an example would make it clearer?
Yes :-)
* An SLNB cannot be the target of any form of assignment, including augmented. Attempting to do so will remove the SLNB and assign to the fully-scoped name.
What's the justification for this limitation?
Not having that limitation creates worse problems, like that having "(1 as a)" somewhere can suddenly make an assignment fail. This is particularly notable with loop headers rather than simple statements.
How and why would it fail?
    # See, for instance, Lib/pydoc.py
    if (re.search(pat, text) as match):
        print("Found:", match.group(0))
I do not believe that is actually code found in Lib/pydoc.py, since that will be a syntax error. What are you trying to say here?
Lib/pydoc.py has a more complicated version of the exact same functionality. This would be a simplification of a common idiom that can be found in the stdlib and elsewhere.
Then the PEP should show a "Before" and "After".
Performance costs
=================
The cost of SLNBs must be kept to a minimum, particularly when they are not used; the normal case MUST NOT be measurably penalized.
What is the "normal case"?
The case where you're not using any SLNBs.
The PEP should make this more clear: "Any implementation must not include any significant performance cost to code that does not use statement-locals."
It takes time, even if only a nanosecond, to bind a value to a name, as opposed to *not* binding it to a name.
x = (spam as eggs)
has to be more expensive than
x = spam
because the first performs two name bindings rather than one. So "MUST NOT" already implies this proposal *must* be rejected. Perhaps you mean that there SHOULD NOT be a SIGNIFICANT performance penalty.
The mere fact that this feature exists in the language MUST NOT measurably impact Python run-time performance.
MUST NOT implies that if there is *any* measurable penalty, even a nano-second, the feature must be rejected. I think that's excessive. Surely a nanosecond cost for the normal case is a reasonable tradeoff if it buys us better expressiveness? Beware of talking in absolutes unless you really mean them.

Besides, as soon as you talk performance, the question has to be, which implementation? Of course we don't want to necessarily impose unreasonable performance and maintenance costs on any implementation. But surely performance cost is a quality of implementation issue. It ought to be a matter of trade-offs: is the benefit sufficient to make up for the cost?

-- Steve
On 24 March 2018 at 04:09, Chris Angelico
On Sat, Mar 24, 2018 at 2:00 AM, Steven D'Aprano
wrote: I see you haven't mentioned anything about Nick Coghlan's (long ago) concept of a "where" block. If memory serves, it would be something like:
    value = x**2 + 2*x where:
        x = some expression
These are not necessarily competing, but they are relevant.
Definitely relevant, thanks. This is exactly what I'm looking for - related proposals that got lost in the lengthy threads on the subject. I'll mention it as another proposal, but if anyone has an actual post for me to reference, that would be appreciated (just to make sure I'm correctly representing it).
That one's a PEP reference: https://www.python.org/dev/peps/pep-3150/

If PEP 572 were to happen, then I'd see some variant of PEP 3150 as a potential future follow-on (allowing the statement local namespace for a simple statement to be populated in a trailing suite, without needing to make the case for statement locals in the first place). If inline local variable assignment were to happen instead, then PEP 3150 would continue to face the double hurdle of pitching both the semantic benefits of statement locals, while also pitching a syntax for defining them.

FWIW, I like this version of the statement local proposal, and think it would avoid a lot of the quirks that otherwise arise when allowing expression level assignments.

Cheers,
Nick.

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On 24 March 2018 at 14:41, Steven D'Aprano
On Sat, Mar 24, 2018 at 05:09:54AM +1100, Chris Angelico wrote:

Just as function-local names shadow global names for the scope of the function, statement-local names shadow other names for that statement. (They can technically also shadow each other, though actually doing this should not be encouraged.)
That seems weird.
Which part? That they shadow, or that they can shadow each other?
Shadowing themselves.
I'm still not convinced these should just shadow local variables. Of course locals will shadow nonlocals, which shadow globals, which shadow builtins. I'm just not sure that we gain much (enough?) to justify adding a new scope between what we already have:
    proposed statement-local
    local
    nonlocal
    class (only during class statement)
    global
    builtins
I think that needs justification by more than just "it makes the implementation easier".
Introducing the new scoping behaviour doesn't make the implementation easier, it makes it harder. However, there are specific aspects of how that proposed new scope works (like not being visible from nested scopes) that make the implementation easier, since they eliminate a whole swathe of otherwise complicated semantic questions :)

At a user experience level, the aim of the scoping limitation is essentially to help improve "code snippet portability". Consider the following piece of code:

    squares = [x**2 for x in iterable]

In Python 2.x, you not only have to check whether or not you're already using "squares" for something, you also need to check whether or not you're using "x", since the iteration variable leaks. In Python 3.x, you only need to check for "squares" usage, since the comprehension has its own inner scope, and any "x" binding you may have defined will be shadowed instead of being overwritten.

For PEP 572, the most directly comparable example is code like this:

    # Any previous binding of "m" is lost completely on the next line
    m = re.match(...)
    if m:
        print(m.groups(0))

In order to re-use that snippet, you need to double-check the surrounding code and make sure that you're not overwriting an "m" variable already used somewhere else in the current scope. With PEP 572, you don't even need to look, since visibility of the "m" in the following snippet is automatically limited to the statement itself:

    if (re.match(...) as m):
        print(m.groups(0))
    # Any previous binding of "m" is visible again here, and hence a
    # common source of bugs is avoided :)

Cheers,
Nick.

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
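Nick's ``re.match`` example can be demonstrated in current Python, where a pasted snippet really does clobber any earlier binding of the name (the strings below are illustrative):

```python
import re

m = "some earlier value"              # pre-existing binding

m = re.match(r"\w+", "hello world")   # pasted snippet overwrites it
if m:
    print(m.group(0))

# The old binding is gone for the rest of the scope -- the hazard the
# statement-local proposal aims to remove:
assert m is not None and m.group(0) == "hello"
```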
On Sat, Mar 24, 2018 at 3:41 PM, Steven D'Aprano
To keep this a manageable length, I've trimmed vigorously. Apologies in advance if I've been too enthusiastic with the trimming :-)
On Sat, Mar 24, 2018 at 05:09:54AM +1100, Chris Angelico wrote:
No, I haven't yet. Sounds like a new section is needed. Thing is, there's a HUGE family of C-like and C-inspired languages that allow assignment expressions, and for the rest, I don't have any personal experience. So I need input from people: what languages do you know of that have small-scope name bindings like this?
I don't know if this counts as "like this", but Lua has a do...end block that introduces a new scope. Something like this:
    x = 1
    do
        x = 2
        print(x)  -- prints 2
    end
    print(x)  -- prints 1
I think that's a neat concept, but I'm struggling to think what I would use it for.
Okay. I'll leave off for now, but if the split of PEPs happens, I'll need to revisit that.
result = (func(x), func(x)+1, func(x)*2)
True, but outside of comprehensions, the most obvious response is "just add another assignment statement". You can't do that in a list comp (or equivalently in a genexp or dict comp).
Yes you can: your PEP gives equivalents that work fine for list comps, starting with factorising the duplicate code out into a helper function, to using a separate loop to get assignment:
[(spam, spam+1) for x in values for spam in (func(x),)]
[(spam, spam+1) for spam in (func(x) for x in values)]
They are the equivalent to "just add another assignment statement" for comprehensions.
They might be mechanically equivalent. They are not syntactically equivalent. This PEP is not about "hey let's do something in Python that's utterly impossible to do". It's "here's a much tidier way to spell something that currently has to be ugly".
Strictly speaking, there's never a time that we cannot use a new assignment statement. But sometimes it is annoying or inconvenient. Consider a contrived example:
    TABLE = [
        alpha, beta, gamma, delta, ...
        func(omega) + func(omega)**2 + func(omega)**3,
    ]
Yes, I can pull out the duplication:
    temp = func(omega)
    TABLE = [
        alpha, beta, gamma, delta, ...
        temp + temp**2 + temp**3,
    ]
but that puts the definition of temp quite distant from its use. So this is arguably nicer:
    TABLE = [
        alpha, beta, gamma, delta, ...
        (func(omega) as temp) + temp**2 + temp**3,
    ]
Right. Definitely advantageous (and another reason not to go with the comprehension-specific options).
Just as function-local names shadow global names for the scope of the function, statement-local names shadow other names for that statement. (They can technically also shadow each other, though actually doing this should not be encouraged.)
That seems weird.
Which part? That they shadow, or that they can shadow each other?
Shadowing themselves.
I'm still not convinced these should just shadow local variables. Of course locals will shadow nonlocals, which shadow globals, which shadow builtins. I'm just not sure that we gain much (enough?) to justify adding a new scope between what we already have:
    proposed statement-local
    local
    nonlocal
    class (only during class statement)
    global
    builtins
I think that needs justification by more than just "it makes the implementation easier".
Nick has answered this part better than I can, so I'll just say "yep, read his post". :)
Shadowing is the same as nested functions (including comprehensions, since they're implemented with functions); and if SLNBs are *not* to shadow each other, the only way is to straight-up disallow it.
Or they can just rebind to the same (statement-)local. E.g.:
    while ("spam" as x):
        assert x == "spam"
        while ("eggs" as x):
            assert x == "eggs"
            break
        assert x == "eggs"
That means that sometimes, ``while ("eggs" as x):`` creates a new variable, and sometimes it doesn't. Why should that be? If you change the way that "spam" is assigned to x, the semantics of the inner 'while' block shouldn't change. It creates a subscope, it uses that subscope, the subscope expires. Curtain comes down. By your proposal, you have to check whether 'x' is shadowing some other variable, and if so, what type. By mine, it doesn't matter; regardless of whether 'x' existed or not, regardless of whether there's any other x in any other scope, that loop behaves the same way.

Function-local names give the same confidence. It doesn't matter what names you use inside a function (modulo 'global' or 'nonlocal' declarations) - they quietly shadow anything from the outside. You need only care about module names duplicating local names if you actually need to use both *in the same context*. Same with built-ins; it's fine to say "id = 42" inside a function as long as you aren't going to also use the built-in id() function in that exact function. Code migration is easy.
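The built-in shadowing point can be checked directly in current Python:

```python
def answer():
    id = 42        # quietly shadows the built-in id() inside this function
    return id

assert answer() == 42
assert isinstance(id(object()), int)   # built-in untouched outside
```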
Why can they not be used in closures? I expect that's going to cause a lot of frustration.
Conceptually, the variable stops existing at the end of that statement. It makes for some oddities, but fewer oddities than every other variant that I toyed with. For example, does this create one single temporary or many different temporaries?
    def f():
        x = "outer"
        funcs = {}
        for i in range(10):
            if (g(i) as x) > 0:
                def closure():
                    return x
                funcs[x] = closure
I think the rule should be either:
- statement-locals actually *are* locals and so behave like locals;
- statement-locals introduce a new scope, but still behave like locals with respect to closures.
No need to introduce two separate modes of behaviour. (Or if there is such a need, then the PEP should explain it.)
That would basically mean locking in some form of semantics.

For your first example, you're locking in the rule that "(g(i) as x)" is exactly the same as "x = g(i)", and you HAVE to then allow that this will potentially assign to global or nonlocal names as well (subject to the usual rules). In other words, you have assignment-as-expression without any form of subscoping. This is a plausible stance and may soon be becoming a separate PEP.

But for your second, you're locking in the same oddities that a 'with' block has: that a variable is being "created" and "destroyed", yet it sticks around for the rest of the function, just in case. It's a source of some confusion to people that the name used in a 'with' statement is actually still valid afterwards. Or does it only stick around if there is a function to close over it?

Honestly, I really want to toss this one into the "well don't do that" basket, and let the semantics be dictated by simplicity and cleanliness even if it means that a closure doesn't see that variable.
"SLNB"? Undefined acronym. What is it? I presume it has something to do with the single-statement variable.
Statement-Local Name Binding, from the title of the PEP. (But people probably don't read titles.)
Indeed. In case it isn't obvious, you should define the acronym the first time you use it in the PEP.
Once again, I assumed too much of people. Expected them to actually read the stuff they're discussing. And once again, the universe reminds me that people aren't like that. Ah well. Will fix that next round of edits.
* An SLNB cannot be the target of any form of assignment, including augmented. Attempting to do so will remove the SLNB and assign to the fully-scoped name.
What's the justification for this limitation?
Not having that limitation creates worse problems, like that having "(1 as a)" somewhere can suddenly make an assignment fail. This is particularly notable with loop headers rather than simple statements.
How and why would it fail?
    a = (1 as a)

With current semantics, this is equivalent to "a = 1". If assignment went into the SLNB, it would be equivalent to "pass". Which do you expect it to do?
MUST NOT implies that if there is *any* measurable penalty, even a nano-second, the feature must be rejected. I think that's excessive. Surely a nanosecond cost for the normal case is a reasonable tradeoff if it buys us better expressiveness?
Steve, you know how to time a piece of code. You debate these kinds of points on python-list frequently. Are you seriously trying to tell me that you could measure a single nanosecond in regular compiling and running of Python code?

With my current implementation, there is an extremely small cost during compilation (a couple of checks of a pointer in a structure, and if it's never changed from its initial NULL, nothing else happens), and zero cost at run time. I believe that this counts as "no measurable penalty".
Beware of talking in absolutes unless you really mean them.
Besides, as soon as you talk performance, the question has to be, which implementation?
Of course we don't want to necessarily impose unreasonable performance and maintenance costs on any implementation. But surely performance cost is a quality of implementation issue. It ought to be a matter of trade-offs: is the benefit sufficient to make up for the cost?
I don't see where this comes in. Let's say that Jython can't implement this feature without a 10% slowdown in run-time performance even if these subscopes aren't used. What are you saying the PEP should say? That it's okay for this feature to hurt performance by 10%? Then it should be rightly rejected. Or that Jython is allowed to ignore this feature? Or what? ChrisA
On 3/24/2018 5:49 AM, Chris Angelico wrote:
On Sat, Mar 24, 2018 at 3:41 PM, Steven D'Aprano
wrote: To keep this a manageable length, I've trimmed vigourously. Apologies in advance if I've been too enthusiastic with the trimming :-) On Sat, Mar 24, 2018 at 05:09:54AM +1100, Chris Angelico wrote:
...
"SLNB"? Undefined acronym. What is it? I presume it has something to do with the single-statement variable.
Statement-Local Name Binding, from the title of the PEP. (But people probably don't read titles.)
Indeed. In case it isn't obvious, you should define the acronym the first time you use it in the PEP.
Once again, I assumed too much of people. Expected them to actually read the stuff they're discussing. And once again, the universe reminds me that people aren't like that. Ah well. Will fix that next round of edits.
To be fair to the readers, you don't indicate you're going to use part of the title as an acronym later. I certainly didn't get it, either, and I read the title and PEP. So I'm in the group of people you assumed too much of.

The traditional way to specify this would be to change part of the title or first usage to: "Statement-Local Name Binding (SLNB)". Which is a signal to the reader that you're going to use this later and they should remember it.

I don't know if it's frowned upon, but I wouldn't put this in the title. Instead, I'd put it in the body of the PEP on first usage. And I'd also make that usage in the first paragraph, instead of many paragraphs in.

Eric
On 24 March 2018 at 09:49, Chris Angelico
Of course we don't want to necessarily impose unreasonable performance and maintenance costs on any implementation. But surely performance cost is a quality of implementation issue. It ought to be a matter of trade-offs: is the benefit sufficient to make up for the cost?
I don't see where this comes in. Let's say that Jython can't implement this feature without a 10% slowdown in run-time performance even if these subscopes aren't used. What are you saying the PEP should say? That it's okay for this feature to hurt performance by 10%? Then it should be rightly rejected. Or that Jython is allowed to ignore this feature? Or what?
I think the PEP should confirm that there's not expected to be a showstopper performance cost in implementing this feature in other Python implementations. That doesn't have to be a big deal - reaching out to the Jython, PyPy, Cython etc implementors and asking them for a quick sanity check that this doesn't impose unmanageable overheads should be sufficient. No need to make this too dogmatic. Paul
On Sat, Mar 24, 2018 at 07:12:49PM +1000, Nick Coghlan wrote:
I think that needs justification by more than just "it makes the implementation easier".
Introducing the new scoping behaviour doesn't make the implementation easier, it makes it harder. [...]
Perhaps I had misunderstood something Chris had said.
At a user experience level, the aim of the scoping limitation is essentially to help improve "code snippet portability".
Consider the following piece of code:
squares = [x**2 for x in iterable]
In Python 2.x, you not only have to check whether or not you're already using "squares" for something, you also need to check whether or not you're using "x", since the iteration variable leaks.
I hear you, and I understand that some people had problems with leakage, but in my own experience, this was not a problem I ever had. On the contrary, it was occasionally useful (what was the last value x took before the comprehension finished?).

The change to Python 3 non-leaking behaviour has solved no problem for me but taken away something which was nearly always harmless and very occasionally useful. So I don't find this to be an especially compelling argument.

But at least comprehensions are intended to be almost entirely self-contained, so it's not actively harmful. But I can't say the same for additional sub-function scopes.
For PEP 572, the most directly comparable example is code like this:
# Any previous binding of "m" is lost completely on the next line m = re.match(...) if m: print(m.groups(0))
In order to re-use that snippet, you need to double-check the surrounding code and make sure that you're not overwriting an "m" variable already used somewhere else in the current scope.
Yes. So what? I'm going to be doing that regardless of whether the interpreter places this use of m in its own scope or not. The scope as seen by the interpreter is not important.

If all we cared about was avoiding name collisions, we could solve that by using 128-bit secret keys as variables:

    var_81c199e61e9f90fd023508aee3265ad9

We don't need multiple scopes to avoid name collisions, we just need to make sure they're all unique :-)

But of course readability counts, and we write code to be read by people, not for the convenience of the interpreter. For that reason, whenever I paste a code snippet, I'm going to check the name and make a conscious decision whether to keep it or change it, and doing that means I have to check whether "m" is already in use regardless of whether or not the interpreter will keep the two (or more!) "m" variables.

So this supposed benefit is really no benefit at all. I still am going to check "m" to see if it clashes. To the extent that this proposal to add sub-function scoping encourages people to do copy-paste coding without even renaming variables to something appropriate for the function they're pasted into, I think this will strongly hurt readability in the long run.
With PEP 572, you don't even need to look, since visibility of the "m" in the following snippet is automatically limited to the statement itself:
if (re.match(...) as m):
    print(m.groups(0))
# Any previous binding of "m" is visible again here, and hence a
# common source of bugs is avoided :)
Is this really a "common source of bugs"?

Do you really mean to suggest that we should be able to copy and paste a code snippet into the middle of a function without checking how it integrates with the surrounding code? Because that's what it seems that you are saying. And not only that we should be able to do so, but that it is important enough that we should add a feature to encourage it?

If people make a habit of pasting snippets of code into their functions without giving any thought to how it fits in with the rest of the function, then any resulting bugs are caused by carelessness and slap-dash technique, not the scoping rules of the language.

The last thing I want to read is a function where the same name is used for two or three or a dozen different things, because the author happened to copy code snippets from elsewhere and didn't bother renaming things to be more appropriate. Never mind whether the interpreter can keep track of which is which, I'm worried about *my* ability to keep track of which is which.

I might be cynical about the professionalism and skills of the average programmer, but even I find it hard to believe that most people would actually do that. But since we're (surely?) going to be taking time to integrate the snippet with the rest of the function, the benefit of not having to check for duplicate variable names evaporates.

We (hopefully!) will be checking for duplicates regardless of whether they are scoped to a single statement or not, because we don't want to read and maintain a function with the same name "x" representing a dozen different things at different times.

I'm not opposed to re-using variable names for two different purposes within a single function. But when I do it, I do it because I made a conscious decision that:

(1) the name is appropriate for both purposes; and

(2) re-using the name does not lead to confusion or make the function hard to read.
I don't re-use names because I've copied some snippet and can't be bothered to change the names. And I don't think we should be adding a feature to enable and justify that sort of poor practice.

Comprehensions have their own scope, and that's at least harmless, if not beneficial, because they are self-contained single expressions. But this would add separate scopes to blocks:

def function():
    x = 1
    if (spam as x):
        ...
    while (ham as x):
        ...
    # much later, deep in the function
    # possibly after some or all of those blocks have ended
    ...
    process(x)  # which x is this?

This would be three different variables all with the same name "x". To track the current value of x I have to track each of the x variables and which is currently in scope.

I don't think we need sub-function scoping. I think it adds more complexity that outweighs whatever benefit it gives.

-- Steve
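For contrast, the status quo is easy to check: ordinary if/while blocks do not introduce scopes, so rebinding inside them overwrites the single function-local name. A minimal sketch, with plain assignments standing in for the proposed `(... as x)` bindings:

```python
def function():
    x = 1
    if True:
        x = "spam"   # rebinds the same function-local x; no new scope
    while False:
        x = "ham"    # never runs, but still marks x as local at compile time
    return x         # there is only ever one x in this function

print(function())    # -> spam
```

This is exactly the behaviour the PEP would change for `as` bindings, which is what the "which x is this?" objection is about.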
On Sat, Mar 24, 2018 at 08:49:08PM +1100, Chris Angelico wrote:
[(spam, spam+1) for x in values for spam in (func(x),)]
[(spam, spam+1) for spam in (func(x) for x in values)]
They are the equivalent to "just add another assignment statement" for comprehensions.
They might be mechanically equivalent. They are not syntactically equivalent. This PEP is not about "hey let's do something in Python that's utterly impossible to do". It's "here's a much tidier way to spell something that currently has to be ugly".
For the record, I don't think either of those are ugly. The first is a neat trick, but the second in particular is a natural, elegant and beautiful way of doing it in a functional style. And the beauty of it is, if it ever becomes too big and unwieldy for a single expression, it is easy to *literally* "just add another assignment statement":

eggs = (long_and_complex_expression_with(x) for x in values)
[(spam, spam+1) for spam in eggs]

So I stand by my claim that even for comprehensions, "just add another assignment statement" is always an alternative.
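Both spellings run today; a concrete, runnable version of the "extra assignment" form, with a hypothetical func standing in for the expensive per-item call:

```python
def func(x):
    # hypothetical stand-in for the expensive computation being bound
    return x * 10

values = [1, 2, 3]
eggs = (func(x) for x in values)            # the extra assignment statement
pairs = [(spam, spam + 1) for spam in eggs]
print(pairs)   # -> [(10, 11), (20, 21), (30, 31)]
```

Because eggs is a generator expression, func is still called only once per item, same as the nested-comprehension trick.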
while ("spam" as x):
    assert x == "spam"
    while ("eggs" as x):
        assert x == "eggs"
        break
    assert x == "eggs"
That means that sometimes, ``while ("eggs" as x):`` creates a new variable, and sometimes it doesn't. Why should that be?
I'm not following you. If we talk implementation for a moment, my proposal is that x is just a regular local variable. So the CPython compiler sees (... as x) in the code and makes a slot for it in the function. (Other implementations may do differently.) Whether or not that local slot gets filled with a value depends on whether the specific (... as x) actually gets executed. That's no different from any other binding operation.

If x is defined as global, then (... as x) will bind to the global, not the local, but otherwise will behave the same. [...]
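The "slot filled only if executed" behaviour already holds for ordinary locals, and is easy to observe (a minimal sketch):

```python
def demo(flag):
    if flag:
        x = "bound"   # the local slot for x is filled only if this runs
    return x          # UnboundLocalError if the branch was skipped

print(demo(True))     # -> bound
try:
    demo(False)
except UnboundLocalError:
    print("x was never bound")
```

Under the proposal, a skipped `(... as x)` would leave its slot empty in exactly the same way.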
Function-local names give the same confidence. It doesn't matter what names you use inside a function (modulo 'global' or 'nonlocal' declarations) - they quietly shadow anything from the outside.
Yes, I get functions, and I think function-scope is a sweet spot between too few scopes and too many. Remember the bad old days of BASIC when all variables were application-global? Even if you used GOSUB as a second-rate kind of function, all the variables were still global. On the other hand, introducing sub-function scopes is, I strongly believe, too many. [...]
I think the rule should be either:
- statement-locals actually *are* locals and so behave like locals;
- statement-locals introduce a new scope, but still behave like locals with respect to closures.
No need to introduce two separate modes of behaviour. (Or if there is such a need, then the PEP should explain it.)
That would basically mean locking in some form of semantics. For your first example, you're locking in the rule that "(g(i) as x)" is exactly the same as "x = g(i)", and you HAVE to then allow that this will potentially assign to global or nonlocal names as well (subject to the usual rules). In other words, you have assignment-as-expression without any form of subscoping. This is a plausible stance and may soon be becoming a separate PEP.
Well, if we really wanted to, we could ban (expression as name) where name was declared global, but why bother?
But for your second, you're locking in the same oddities that a 'with' block has: that a variable is being "created" and "destroyed", yet it sticks around for the rest of the function, just in case.
Where is it documented that with blocks destroy variables? They don't. `with expression as name` is a name-binding operation no different from `name = expression` and the others. With the sole special case of except blocks auto-magically deleting the exception name, the only way to unbind a name is to use `del`.

What you're describing is not an oddity, but the standard way variables work in Python, and damn useful too. I have code that requires that the `with` variable is not unbound at the end of the block.
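The except-clause special case mentioned here is observable directly (a small sketch):

```python
try:
    1 / 0
except ZeroDivisionError as exc:
    caught = type(exc).__name__
# Unlike "with ... as", an except clause deletes its target name on exit
try:
    exc
except NameError:
    print("exc is unbound here")
print(caught)   # -> ZeroDivisionError
```

Everything else bound inside the block (like caught above) survives, which is the standard behaviour being defended.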
It's a source of some confusion to people that the name used in a 'with' statement is actually still valid afterwards.
The difference between "import foo" and "from foo import bar" is a source of some confusion to some people. I should know, because I went through that period myself. Just because "some people" make unjustified assumptions about the semantics of a language feature doesn't necessarily mean the language feature is wrong or harmful.
Or does it only stick around if there is a function to close over it?
No, there's no need for a closure:

py> with open("/tmp/foo", "w") as f:
...     pass
...
py> f.closed
True
py> f.name
'/tmp/foo'
Honestly, I really want to toss this one into the "well don't do that" basket, and let the semantics be dictated by simplicity and cleanliness even if it means that a closure doesn't see that variable.
If `(expression as name)` just binds to a local, this is a non-problem. [...]
Indeed. In case it isn't obvious, you should define the acronym the first time you use it in the PEP.
Once again, I assumed too much of people. Expected them to actually read the stuff they're discussing. And once again, the universe reminds me that people aren't like that. Ah well. Will fix that next round of edits.
Sorry I don't have time to read that paragraph, so I'll just assume you are thanking me for pointing out your terrible error and offering profuse apologies. *wink*
* An SLNB cannot be the target of any form of assignment, including augmented.
Attempting to do so will remove the SLNB and assign to the fully-scoped name.
What's the justification for this limitation?
Not having that limitation creates worse problems, like that having "(1 as a)" somewhere can suddenly make an assignment fail. This is particularly notable with loop headers rather than simple statements.
How and why would it fail?
a = (1 as a)
With current semantics, this is equivalent to "a = 1". If assignment went into the SLNB, it would be equivalent to "pass". Which do you expect it to do?
Sorry, I don't follow this. If assignment goes into the statement-local, then it would be equivalent to:

    statement-local a = 1

not pass. Anyway, this confusion disappears if a is just a local. Then it is just:

    local a = 1  # the right hand side (1 as a)
    local a = 1  # the left hand side a = ...

which presumably some interpreters could optimize down to a single assignment. If they can be bothered.
MUST NOT implies that if there is *any* measurable penalty, even a nano-second, the feature must be rejected. I think that's excessive. Surely a nanosecond cost for the normal case is a reasonable tradeoff if it buys us better expressiveness?
Steve, you know how to time a piece of code. You debate these kinds of points on python-list frequently. Are you seriously trying to tell me that you could measure a single nanosecond in regular compiling and running of Python code?
On my computer? Not a hope. But some current generation computers have sub-nanosecond CPU clock rates, and 3.7 is due to have new timers with nanosecond resolution: https://www.python.org/dev/peps/pep-0564/ I have no difficulty in believing that soon, if not right now, people will have sufficiently fast computers that yes, a nanosecond difference could be reliably measured with sufficient care. [...]
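The PEP 564 clocks referenced here are exposed as `*_ns` variants in the `time` module; a minimal sketch of timing at integer-nanosecond resolution:

```python
import time

t0 = time.perf_counter_ns()
total = sum(range(1000))
t1 = time.perf_counter_ns()

# Integer nanoseconds: no float rounding even for very short intervals
print(total, t1 - t0)
```

Whether a single-nanosecond compile-time difference is *reliably* measurable through all the surrounding noise is, of course, the point under dispute.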
Of course we don't want to necessarily impose unreasonable performance and maintenance costs on any implementation. But surely performance cost is a quality-of-implementation issue. It ought to be a matter of trade-offs: is the benefit sufficient to make up for the cost?
I don't see where this comes in. Let's say that Jython can't implement this feature without a 10% slowdown in run-time performance even if these subscopes aren't used.
Unlikely, but for the sake of the argument, okay.
What are you saying the PEP should say? That it's okay for this feature to hurt performance by 10%? Then it should be rightly rejected. Or that Jython is allowed to ignore this feature? Or what?
That's really for Guido to decide whether the benefit is worth the (hypothetical) cost. But why single out this feature from every other syntactic feature added to Python over its history? We have never before, as far as I can tell, demanded that a feature prove that every Python implementation be able to support the feature with ZERO performance cost before accepting the PEP.

Normally, we introduce a new feature, and expect that like any new code, the first version may not be the most efficient, but subsequent versions will be faster. The first few versions of Python 3 were significantly slower than Python 2. Normally we make performance a trade-off: it's okay to make certain things a bit slower if there are sufficient other benefits.

I still don't understand why you think that sort of tradeoff doesn't apply here. Effectively you seem to be saying that the value of this proposed feature is so infinitesimally small that we shouldn't accept *any* runtime cost, no matter how small, to gain this feature.

-- Steve
On Sat, Mar 24, 2018 at 07:12:49PM +1000, Nick Coghlan wrote: [...]
At a user experience level, the aim of the scoping limitation is essentially to help improve "code snippet portability".
Consider the following piece of code:
squares = [x**2 for x in iterable]
In Python 2.x, you not only have to check whether or not you're already using "squares" for something, you also need to check whether or not you're using "x", since the iteration variable leaks. [...]
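The Python 3 behaviour is easy to confirm: the iteration variable stays inside the comprehension's own scope. A minimal sketch, assuming it runs at module level where no other x exists:

```python
squares = [x**2 for x in range(5)]
print(squares)          # -> [0, 1, 4, 9, 16]

try:
    x                   # in Python 3, x did not leak out of the comprehension
except NameError:
    print("x does not leak")
```

In Python 2, the same code would instead leave x bound to 4, which is the leakage being described.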
For PEP 572, the most directly comparable example is code like this:
# Any previous binding of "m" is lost completely on the next line
m = re.match(...)
if m:
    print(m.groups(0))
On 24/03/2018 14:44, Steven D'Aprano wrote:

In order to re-use that snippet, you need to double-check the surrounding code and make sure that you're not overwriting an "m" variable already used somewhere else in the current scope.

Yes. So what? I'm going to be doing that regardless of whether the interpreter places this use of m in its own scope or not. The scope as seen by the interpreter is not important.

Good for you. But the proposed scoping rules are an extra safeguard for programmers who are less conscientious than you, or for anyone (including you) who is short of time, or misses something. An extra level of protection against introducing a bug is IMO a Good Thing.
If all we cared about was avoiding name collisions, we could solve that by using 128-bit secret keys as variables:
var_81c199e61e9f90fd023508aee3265ad9

Good luck with that. :-)
We don't need multiple scopes to avoid name collisions, we just need to make sure they're all unique :-)

You could use the same argument to justify "We don't need separate local and global scopes". But we have them, and it makes it easier and safer to cut-and-paste functions. I assume you don't consider that a Bad Thing.
But of course readability counts, and we write code to be read by people, not for the convenience of the interpreter.
For that reason, whenever I paste a code snippet, I'm going to check the name and make a conscious decision whether to keep it or change it, and doing that means I have to check whether "m" is already in use regardless of whether or not the interpreter will keep the two (or more!) "m" variables. So this supposed benefit is really no benefit at all. I still am going to check "m" to see if it clashes.

Same argument, same reply. Good for you - but there's nothing wrong with an extra safety net. And you make essentially the same point a few more times, I won't repeat myself further.
To the extent that this proposal to add sub-function scoping encourages people to do copy-paste coding without even renaming variables to something appropriate for the function they're pasted into, I think this will strongly hurt readability in the long run.

I think it will aid readability, precisely for the reason Nick gives: you need to make fewer checks whether variables are or are not used elsewhere.
With PEP 572, you don't even need to look, since visibility of the "m" in the following snippet is automatically limited to the statement itself:
if (re.match(...) as m):
    print(m.groups(0))
# Any previous binding of "m" is visible again here, and hence a
# common source of bugs is avoided :)

Is this really a "common source of bugs"?
On 24/03/2018 16:02, Steven D'Aprano wrote:
Yes, I get functions, and I think function-scope is a sweet spot between too few scopes and too many. Remember the bad old days of BASIC when all variables were application-global? Even if you used GOSUB as a second-rate kind of function, all the variables were still global.
On the other hand, introducing sub-function scopes is, I strongly believe, too many.
We are all entitled to our beliefs. But the decision was made to stop a for-variable from leaking from a list comprehension - you may not agree with that decision, but it was presumably a reasonable one. Using SLNBs that don't leak into the surrounding local scope is ISTM a similar decision, and one that, if made, would be made for similar reasons. Rob Cliffe
On 03/24/2018 09:27 AM, Rob Cliffe via Python-ideas wrote:
On 24/03/2018 14:44, Steven D'Aprano wrote:
On Sat, Mar 24, 2018 at 07:12:49PM +1000, Nick Coghlan wrote:
For PEP 572, the most directly comparable example is code like this:
# Any previous binding of "m" is lost completely on the next line
m = re.match(...)
if m:
    print(m.groups(0))
In order to re-use that snippet, you need to double-check the surrounding code and make sure that you're not overwriting an "m" variable already used somewhere else in the current scope.
Yes. So what? I'm going to be doing that regardless of whether the interpreter places this use of m in its own scope or not. The scope as seen by the interpreter is not important.
Good for you. But the proposed scoping rules are an extra safeguard for programmers who are less conscientious than you, or for anyone (including you) who is short of time, or misses something. An extra level of protection against introducing a bug is IMO a Good Thing.
But it's not a free thing. Our cars have seat belts, not six-point restraints, and either way the best practice is to be aware of one's surroundings, not rely on the safeguards to protect us against carelessness.
To the extent that this proposal to add sub-function scoping encourages people to do copy-paste coding without even renaming variables to something appropriate for the function they're pasted into, I think this will strongly hurt readability in the long run.
I think it will aid readability, precisely for the reason Nick gives: you need to make fewer checks whether variables are or are not used elsewhere.
Extra levels of intermingled scope are extra complication (for humans, too!); extra complication does not (usually) help readability -- I agree with D'Aprano that this is not a helpful complication. -- ~Ethan~
This is a super complex topic. There are at least three separate levels of critique possible, and all are important.

First there is the clarity of the PEP. Steven D'Aprano has given you great detailed feedback here and you should take it to heart (even if you disagree with his opinion about the specifics). I'd also recommend treating some of the "rejected alternatives" more like "open issues" (which are to be resolved during the review and feedback cycle). And you probably need some new terminology -- the abbreviation SLNB is awkward (I keep having to look it up), and I think we need a short, crisp name for the new variable type.

Then there is the issue of syntax. While `(f() as x)` is a cool idea (and we should try to recover who deserves credit for first proposing it), it's easy to overlook in the middle of an expression. It's arguably more confusing because the scoping rules you propose are so different from the existing three other uses of `as NAME` -- and it causes an ugly wart in the PEP because two of those other uses are syntactically so close that you propose to ban SLNBs there.

When it comes to alternatives, I think we've brainwashed ourselves into believing that inline assignments using `=` are so evil that it's hard to objectively explain why it's bad -- we're just repeating the mantra here. I wish we could do more quantitative research into how bad this actually is in languages that do have it. We should also keep an open mind about alternative solutions present in other languages. Here it would be nice if we had some qualitative research into what other languages actually do (both about syntax and about semantics, for sure).

The third issue is that of semantics. I actually see two issues here. One is whether we need a new scope (and whether it should be as weird as proposed). Steven seems to think we don't. I'm not sure that the counter-argument that we're already down that path with comprehension scopes is strong enough.
The other issue is that, if we decide we *do* need (or want) statement-local scopes, the PEP must specify the exact scope of a name bound at any point in a statement. E.g. is `d[x] = (f() as x)` valid? And what should we do if a name may or may not be bound, as in `if (f(1) as x) or (f(2) as y): g(y)` -- should that be a compile-time error (since we can easily tell that y isn't always defined when `g(y)` is called) or a runtime error (as we do for unbound "classic" locals)? And there are further details, e.g. are these really not allowed to be closures? And are they single-assignment? (Or can you do e.g. `(f(1) as x) + (f(2) as x)`?)

I'm not sure if there are still places in Python where evaluation order is unspecified, but I suspect there are (at the very least the reference manual is incomplete in specifying the exact rules, e.g. I can't find words specifying the evaluation order in a slice). We'll need to fix all of those, otherwise the use of local name bindings in such cases would have unspecified semantics (or the evaluation order could suddenly shift when a local name binding was added).

So, there are lots of interesting questions!

I do think there are somewhat compelling use cases; more than comprehensions (which I think are already over-used) I find myself frequently wishing for a better way to write

m = pat.match(text)
if m:
    g = m.group(0)
    if check(g):  # Some check that's not easily expressed as a regex
        print(g)

It would be nice if I could write that as

if (m = pat.match(text)) and check((g = m.group(0))):
    print(g)

or

if (pat.match(text) as m) and check((m.group(0) as g)):
    print(g)

-- 
--Guido van Rossum (python.org/~guido)
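The status-quo spelling in that wish runs unchanged today; a concrete sketch with a real pattern and a hypothetical check():

```python
import re

pat = re.compile(r"\d+")

def check(s):
    # hypothetical stand-in: a test not easily expressed in the regex itself
    return int(s) % 2 == 0

text = "42 apples"
m = pat.match(text)
if m:
    g = m.group(0)
    if check(g):
        print(g)   # -> 42
```

The two proposed spellings compress this to a single if-header; the question the thread is debating is what scope m and g then live in.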
On 25 March 2018 at 15:34, Guido van Rossum
This is a super complex topic. There are at least three separate levels of critique possible, and all are important.
First there is the clarity of the PEP. Steven D'Aprano has given you great detailed feedback here and you should take it to heart (even if you disagree with his opinion about the specifics). I'd also recommend treating some of the "rejected alternatives" more like "open issues" (which are to be resolved during the review and feedback cycle). And you probably need some new terminology -- the abbreviation SLNB is awkward (I keep having to look it up), and I think we need a short, crisp name for the new variable type.
I've used "ephemeral name binding" before, but that's even longer than saying ess-ell-enn-bee (for Statement Local Name Binding), and also doesn't feel right for a proposal that allows the binding to persist for the entire suite in compound statements. Given the existing namespace stack of builtin<-global<-nonlocal<-local, one potential short name would be "sublocal", to indicate that these references are even more local than locals (they're *so* local, they don't even appear in locals()!).
Then there is the issue of syntax. While `(f() as x)` is a cool idea (and we should try to recover who deserves credit for first proposing it),
I know I first suggested it years ago, but I don't recall if anyone else proposed it before me.
it's easy to overlook in the middle of an expression.
That I agree with - the more examples I've seen using it, the less I've liked how visually similar "(a as b)" is to "(a and b)".
It's arguably more confusing because the scoping rules you propose are so different from the existing three other uses of `as NAME` -- and it causes an ugly wart in the PEP because two of those other uses are syntactically so close that you propose to ban SLNBs there.

When it comes to alternatives, I think we've brainwashed ourselves into believing that inline assignments using `=` are so evil that it's hard to objectively explain why it's bad -- we're just repeating the mantra here. I wish we could do more quantitative research into how bad this actually is in languages that do have it. We should also keep an open mind about alternative solutions present in other languages. Here it would be nice if we had some qualitative research into what other languages actually do (both about syntax and about semantics, for sure).
Writing "name = expr" when you meant "name == expr" remains a common enough source of bugs in languages that allow it that I still wouldn't want to bring that particular opportunity for semantically significant typos over to Python.

Using "name := expr" doesn't have that problem though (since accidentally adding ":" is a much harder typo to make than leaving out "="), and has the added bonus that we could readily restrict the LHS to single names. I also quite like the way it reads in conditional expressions:

value = f() if (f := lookup_function(args)) is not None else default

And if we do end up going with the approach of defining a separate sublocal namespace, the fact that "n := ..." binds a sublocal, while "n = ..." and "... as n" both bind regular locals would be clearer than having the target scope of "as" be context dependent.
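For interpreters that support the `:=` spelling, the conditional expression above is runnable as written; a self-contained sketch with a hypothetical lookup_function (since the condition of a conditional expression is evaluated first, f is bound before f() is called):

```python
def lookup_function(name):
    # hypothetical registry lookup, returning a callable or None
    return {"double": lambda: 2 * 21}.get(name)

default = -1
value = f() if (f := lookup_function("double")) is not None else default
print(value)    # -> 42

value = f() if (f := lookup_function("missing")) is not None else default
print(value)    # -> -1
```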
The third issue is that of semantics. I actually see two issues here. One is whether we need a new scope (and whether it should be as weird as proposed). Steven seems to think we don't. I'm not sure that the counter-argument that we're already down that path with comprehension scopes is strong enough.
The other issue is that, if we decide we *do* need (or want) statement-local scopes, the PEP must specify the exact scope of a name bound at any point in a statement. E.g. is `d[x] = (f() as x)` valid? And what should we do if a name may or may not be bound, as in `if (f(1) as x) or (f(2) as y): g(y)` -- should that be a compile-time error (since we can easily tell that y isn't always defined when `g(y)` is called) or a runtime error (as we do for unbound "classic" locals)? And there are further details, e.g. are these really not allowed to be closures? And are they single-assignment? (Or can you do e.g. `(f(1) as x) + (f(2) as x)`?)
I think this need to more explicitly specify evaluation order applies regardless of whether we define a sublocal scope or not: expression level name binding in any form makes evaluation order (and evaluation scope!) matter in ways that we can currently gloss over, since you need to be relying on functions with side effects in order to even observe the differences.

If the expression level bindings are just ordinary locals, it does open up some potentially interesting order of evaluation testing techniques, though:

expected_order = list(range(3))
actual_order = iter(expected_order)
defaultdict(int)[(first := next(actual_order)):(second := next(actual_order)):(third := next(actual_order))]
self.assertEqual([first, second, third], expected_order)

With sublocals, you'd need to explicitly promote them to regular locals to get the same effect:

expected_order = list(range(3))
actual_order = iter(expected_order)
__, first, second, third = defaultdict(int)[(first := next(actual_order)):(second := next(actual_order)):(third := next(actual_order))], first, second, third
self.assertEqual([first, second, third], expected_order)

That said, it's debatable whether *either* of those is any clearer for that task than the status quo of just using list append operations:

expected_order = list(range(3))
actual_order = []
defaultdict(int)[actual_order.append(0):actual_order.append(1):actual_order.append(2)]
self.assertEqual(actual_order, expected_order)
I'm not sure if there are still places in Python where evaluation order is unspecified, but I suspect there are (at the very least the reference manual is incomplete in specifying the exact rules, e.g. I can't find words specifying the evaluation order in a slice). We'll need to fix all of those, otherwise the use of local name bindings in such cases would have unspecified semantics (or the evaluation order could suddenly shift when a local name binding was added).
One that surprised me earlier today is that it looks like we never transferred the generator expression wording about the scope of evaluation for the outermost iterable over to the sections describing comprehension evaluation - we only point out that the result subexpression evaluation and the iteration variable binding happen in a nested scope. (Although now I'm wondering if there might already be a docs tracker issue for that, and I just forgot about it) Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Sun, Mar 25, 2018 at 4:34 PM, Guido van Rossum
This is a super complex topic. There are at least three separate levels of critique possible, and all are important.
Thank you for your detailed post. I'll respond to some of it here, and some more generally below.
First there is the clarity of the PEP. Steven D'Aprano has given you great detailed feedback here and you should take it to heart (even if you disagree with his opinion about the specifics). I'd also recommend treating some of the "rejected alternatives" more like "open issues" (which are to be resolved during the review and feedback cycle). And you probably need some new terminology -- the abbreviation SLNB is awkward (I keep having to look it up), and I think we need a short, crisp name for the new variable type.
Agreed that it needs a new name. I've been trying to avoid looking for something that's short-yet-inaccurate, and sticking to the accurate-but-unwieldy; perhaps Nick's "sublocal" will serve the purpose?
Then there is the issue of syntax. While `(f() as x)` is a cool idea (and we should try to recover who deserves credit for first proposing it), it's easy to overlook in the middle of an expression. It's arguably more confusing because the scoping rules you propose are so different from the existing three other uses of `as NAME` -- and it causes an ugly wart in the PEP because two of those other uses are syntactically so close that you propose to ban SLNBs there.

When it comes to alternatives, I think we've brainwashed ourselves into believing that inline assignments using `=` are so evil that it's hard to objectively explain why it's bad -- we're just repeating the mantra here. I wish we could do more quantitative research into how bad this actually is in languages that do have it. We should also keep an open mind about alternative solutions present in other languages. Here it would be nice if we had some qualitative research into what other languages actually do (both about syntax and about semantics, for sure).
Not qualitative, but anecdotal: I do sometimes have to remind my JavaScript students to check whether they've typed enough equals signs. And that's in a language where the normal comparison operator is ===. It's *still* not uncommon to see a comparison spelled =.
The third issue is that of semantics. I actually see two issues here. One is whether we need a new scope (and whether it should be as weird as proposed). Steven seems to think we don't. I'm not sure that the counter-argument that we're already down that path with comprehension scopes is strong enough. The other issue is that, if we decide we *do* need (or want) statement-local scopes, the PEP must specify the exact scope of a name bound at any point in a statement. E.g. is `d[x] = (f() as x)` valid?
Yes, it is. The sublocal name (I'm going to give this term a try and see how it works; if not, we can revert to "bullymong", err I mean "SLNB") remains valid for all retrievals until the end of the statement, which includes the assignment.
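Chris's answer relies on an ordering guarantee Python already provides: in a subscript assignment, the right-hand side is evaluated before the target expression, so a name bound while evaluating the RHS would already be available to the subscript. A runnable sketch (the `traced` helper is invented for illustration):

```python
# Demonstrates that in "d[key_expr] = value_expr", the value expression
# is evaluated before the subscript target.
order = []

def traced(tag, value):
    order.append(tag)
    return value

d = {}
d[traced("target", "k")] = traced("rhs", "v")
print(order)  # the RHS runs first, then the subscript target
```
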
And what should we do if a name may or may not be bound, as in `if (f(1) as x) or (f(2) as y): g(y)` -- should that be a compile-time error (since we can easily tell that y isn't always defined when `g(y)` is called) or a runtime error (as we do for unbound "classic" locals)?
The way I've been thinking about it (and this is reflected in the reference implementation) is that 'y' becomes, in effect, a new variable that doesn't collide with any other 'y' in the same function or module or anything. For the duration of this statement, 'x' and 'y' are those special variables. So it's similar to writing this:

    def func():
        x = f(1)
        if x:
            g(y)
        else:
            y = f(2)
            g(y)

which will raise UnboundLocalError when x is true. The same behaviour happens here.
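Chris's plain-locals rendering runs today and shows exactly the UnboundLocalError he describes; here it is laid out runnably, with invented stand-ins for f(1), f(2) and g:

```python
def g(y):
    return y

def func(first_result):
    # Plain-locals equivalent of: if (f(1) as x) or (f(2) as y): g(y)
    x = first_result     # stand-in for f(1)
    if x:
        return g(y)      # y was never bound on this path
    else:
        y = "second"     # stand-in for f(2)
        return g(y)

print(func(False))       # the 'else' path binds y, so this works
try:
    func(True)
except UnboundLocalError:
    print("UnboundLocalError, as described")
```
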
And there are further details, e.g. are these really not allowed to be closures? And are they single-assignment? (Or can you do e.g. `(f(1) as x) + (f(2) as x)`?)
Technically, what happens is that the second one creates _another_ sublocal name, whose scope begins from the point of assignment and goes to the end of the statement. Since this expression must all be within one statement, both sublocals will expire simultaneously, so it's effectively the same as reassigning to the same name, except that the old object won't be dereferenced until the whole statement ends. (And since Python-the-language doesn't guarantee anything about dereferenced object destruction timings, this will just be a point of curiosity.)
So, there are lots of interesting questions! I do think there are somewhat compelling use cases; more than comprehensions (which I think are already over-used) I find myself frequently wishing for a better way to write
    m = pat.match(text)
    if m:
        g = m.group(0)
        if check(g):  # Some check that's not easily expressed as a regex
            print(g)
It would be nice if I could write that as
if (m = pat.match(text)) and check((g = m.group(0))): print(g)
or
if (pat.match(text) as m) and check((m.group(0) as g)): print(g)
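For comparison, the status-quo version at the top of this exchange is runnable as-is; here is a self-contained sketch (the pattern and the `check` function are invented for illustration):

```python
import re

pat = re.compile(r"[a-z]+")

def check(g):
    # Stand-in for "some check that's not easily expressed as a regex"
    return len(g) > 2

def first_checked_word(text):
    m = pat.match(text)
    if m:
        g = m.group(0)
        if check(g):
            return g
    return None

print(first_checked_word("hello world"))  # hello
print(first_checked_word("hi there"))     # None (matches, but fails the check)
print(first_checked_word("123"))          # None (no match at all)
```
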
Agreed. I'm currently thinking that I need to do what several people have suggested and break this into two completely separate PEPs:

1) Sublocal namespacing
2) Assignment expressions

Sublocal names can be used in a number of ways. There could be a "with sublocal EXPR as NAME:" syntax that actually disposes of the name binding at the end of the block, and "except Exception as e:" could shadow rather than unbinding. Maybe list comprehensions could change, too - instead of creating a function, they just create a sublocal scope.

That may be the best way forward. I'm not sure.

ChrisA
On 25 March 2018 at 17:18, Chris Angelico
Agreed. I'm currently thinking that I need to do what several people have suggested and break this into two completely separate PEPs:
1) Sublocal namespacing 2) Assignment expressions
Sublocal names can be used in a number of ways. There could be a "with sublocal EXPR as NAME:" syntax that actually disposes of the name binding at the end of the block,
The scoping affects the name binding rather than the expression evaluation, so I'd expect any such variant to be:

    with EXPR as sublocal NAME:
        ...
and "except Exception as e:" could shadow rather than unbinding. Maybe list comprehensions could change, too - instead of creating a function, they just create a sublocal scope.
That may be the best way forward. I'm not sure.
I think you can treat it as an open design question within the current PEP by tweaking the PEP title to be "Name binding as an expression". If we allow expression level name binding at all, it will be an either/or choice between binding to a new sublocal scope and binding regular locals, and you can handle that by describing sublocals as your current preferred option, but point out that the same *syntactic* idea could be adopted without introducing the sublocals semantics (in the latter case, the distinction created by the PEP would just be between "assignment statements" and "assignment expressions", rather than between "local assignments" and "sublocal assignments").

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Sun, Mar 25, 2018 at 05:00:37PM +1000, Nick Coghlan wrote:
Given the existing namespace stack of builtin<-global<-nonlocal<-local, one potential short name would be "sublocal", to indicate that these references are even more local than locals (they're *so* local, they don't even appear in locals()!).
If we go down this track, +1 on the name "sublocal". [...]
And if we do end up going with the approach of defining a separate sublocal namespace, the fact that "n := ..." binds a sublocal, while "n = ..." and "... as n" both bind regular locals would be clearer than having the target scope of "as" be context dependent.
The scope issue is a good argument for avoiding "as" if we have sublocal binding.

One thing I like about the (expression as name) syntax is that the expression comes first. The Pascal-style := binding syntax reverses that.

While we're bike-shedding, here are some alternatives to compare:

    target = default if (expression as name) is None else name
    target = default if (name := expression) is None else name
    target = default if (expression -> name) is None else name
    target = default if (name <- expression) is None else name

The arrow assignment operators <- and -> are both used by R. A dedicated non-ASCII forward arrow is also used by some programmable calculators, including HP and TI. But let's not start using non-ASCII symbols yet.

If we don't like a symbolic operator, we could channel BASIC from the 1970s and write something like this:

    target = default if (let name = expression) is None else name

Pros:

- requiring the keyword "let" prevents the "equals versus assignment" class of errors;
- easier to search for a keyword than a symbolic operator.

Cons:

- more verbose;
- looks like BASIC;
- requires a new keyword.

-- Steve
On Sun, Mar 25, 2018 at 12:18 AM, Chris Angelico
[...] Agreed. I'm currently thinking that I need to do what several people have suggested and break this into two completely separate PEPs:
1) Sublocal namespacing 2) Assignment expressions
Sublocal names can be used in a number of ways. There could be a "with sublocal EXPR as NAME:" syntax that actually disposes of the name binding at the end of the block, and "except Exception as e:" could shadow rather than unbinding. Maybe list comprehensions could change, too - instead of creating a function, they just create a sublocal scope.
That may be the best way forward. I'm not sure.
ChrisA
I don't think the PEP should be split up into two PEPs. The two topics are too closely related. But you can have separate discussions about each issue in the PEP. -- --Guido van Rossum (python.org/~guido)
[Chris Angelico
... Not qualitative, but anecdotal: I do sometimes have to remind my JavaScript students to check whether they've typed enough equals signs. And that's in a language where the normal comparison operator is ===. It's *still* not uncommon to see a comparison spelled =.
I wonder whether Guido remembers this ;-) In the very, very, VERY early days, Python didn't have "==". Plain single "=" was used for both assignment and equality testing. So way back then, using "=" for embedded assignment too was intractable on the face of it.

I'm not clear on why it changed. I remember writing to Guido about how to disambiguate between the "bind" and "test for equality" intents in isolated expressions typed at the interactive prompt, and next thing I knew the language changed to use "==" for the latter.

In any case, I really don't want to see plain "=" for embedded assignments now. It's been the source of some of the worst C debugging nightmares I've wasted months of my life tracking down. Here's one that baffled an office full of MIT grads for half a day before I noticed the real problem:

    assert(n=2);

You can fill in the rest of the story yourself - but you'll miss the full extent of the agony it caused ;-)

Guido's original intuition was right: regardless of programming experience, it remains sorely tempting to write "x = y" when equality testing is intended. To this day I routinely get a syntax error in Python when doing that by mistake. For which I'm eternally grateful. Any other way of spelling it would be preferable.

Requiring parentheses around it isn't enough; e.g.,

    if (x = 1) or (y = 2):

would almost certainly not do what was intended either. There's also that many newcomers from C-like languages habitually put all `if` and `while` tests in parens.

I'm fond enough of ":=". Icon used that for assignment (embedded or otherwise), and I don't recall any bugs due to that. It was hard to confuse for "==" (whether conceptual confusion or visual confusion). That was just prone to the _other_ problem with embedded assignments: staring and staring trying to find the code where a name was most recently bound - "oh! it was bound inside the third nested clause in the `while` test two pages back".
So it would be nice to combine embedded assignment with some notion of limited scope - but I'm much more concerned that the spelling not be easily confusable with "==". But not really a fan of overly wordy spellings either. There you go! All the rest follows trivially from the Zen of Python ;-)
On Mon, Mar 26, 2018 at 10:40 AM, Tim Peters
Here's one that baffled an office full of MIT grads for half a day before I noticed the real problem:
assert(n=2);
You can fill in the rest of the story yourself - but you'll miss the full extent of the agony it caused ;-)
I have to confess that my eye jumped down to the code before reading all of the text above it, and as a result, I thought you were pointing out that "n=2" for assignment would conflict with named argument usage. Which it does, but that wasn't your point :)

Is there any way that ":=" can legally occur in Python source (ignoring string literals) currently? A colon is always followed by a 'suite' or a 'test', neither of which can start with '=', and annotated assignment has to have something between the ':' and '='. If it's 100% unambiguous, it could be the solution to the current wonkiness with 'as' having multiple meanings; in fact, there would then be a new form of consistency: 'as' binds the special result of a statement, but ':=' binds arbitrary expressions.

ChrisA
On Sun, Mar 25, 2018 at 4:40 PM, Tim Peters
[Chris Angelico
] ... Not qualitative, but anecdotal: I do sometimes have to remind my JavaScript students to check whether they've typed enough equals signs. And that's in a language where the normal comparison operator is ===. It's *still* not uncommon to see a comparison spelled =.
I wonder whether Guido remembers this ;-) In the very, very, VERY early days, Python didn't have "==". Plain single "=" was used for both assignment and equality testing. So way back then, using "=" for embedded assignment too was intractable on the face of it.
Wow, I did not remember this. In fact I had to track down the 0.9.1 release that's somewhere on the web to see for myself. :-) Should add this to the HOPL-IV paper if I end up writing it (I'm still far from decided either way).
I'm not clear on why it changed. I remember writing to Guido about how to disambiguate between the "bind" and "test for equality" intents in isolated expressions typed at the interactive prompt, and next thing I knew the language changed to use "==" for the latter.
Hm, that's probably why -- the desire for top-level expressions to allow comparison. Also probably the realization that this is one thing where (at the time) this particular difference with C/C++ was just annoying for most new users. I'm assuming that <>, the ancient alternate spelling for != (that Barry still misses), came from the same source: ABC ( https://homepages.cwi.nl/~steven/abc/qr.html#TESTS). But there was no compelling reason to remove <> (only to add !=) so it lingered until 3.0. Presumably ABC got both from Pascal ( https://www.tutorialspoint.com/pascal/pascal_relational_operators.htm).
In any case, I really don't want to see plain "=" for embedded assignments now. [...]
I'm fond enough of ":=". Icon used that for assignment (embedded or otherwise), and I don't recall any bugs due to that. It was hard to confuse for "==" (whether conceptual confusion or visual confusion).
Most languages I learned in the '70s used it: both Algols, Pascal. (Though not Fortran.)
That was just prone to the _other_ problem with embedded assignments: staring and staring trying to find the code where a name was most recently bound - "oh! it was bound inside the third nested clause in the `while` test two pages back". So it would be nice to combine embedded assignment with some notion of limited scope - but I'm much more concerned that the spelling not be easily confusable with "==". But not really a fan of overly wordy spellings either.
The "two pages back" problem can happen just as easy with regular assignments or for-loop control variables.
There you go! All the rest follows trivially from the Zen of Python ;-)
I gotta say I'm warming up to := in preference over 'as', *if* we're going to do this at all (not a foregone conclusion at all). The scope question is far from easy though. I find it particularly grating that when an inline assignment occurs in an 'if' statement, its scope is the entire body of the 'if'. If that body is two pages long, by the end of it the reader (or even the writer!) may well have lost track of where it was defined and may be confused by its consequences past the end of the body. -- --Guido van Rossum (python.org/~guido)
On Mon, Mar 26, 2018 at 12:24 PM, Guido van Rossum
I gotta say I'm warming up to := in preference over 'as', *if* we're going to do this at all (not a foregone conclusion at all).
So am I, primarily due to its lack of syntactic ambiguities.
The scope question is far from easy though. I find it particularly grating that when an inline assignment occurs in an 'if' statement, its scope is the entire body of the 'if'. If that body is two pages long, by the end of it the reader (or even the writer!) may well have lost track of where it was defined and may be confused by its consequences past the end of the body.
I think this one can be given to style guides. The useful situations (eg regex match capturing) are sufficiently valuable that the less-useful ones can just come along for the ride, just like "x = lambda: ..." is perfectly valid even though "def" would be preferable. ChrisA
[Tim]
I wonder whether Guido remembers this ;-) In the very, very, VERY early days, Python didn't have "==". Plain single "=" was used for both assignment and equality testing.
[Guido]
Wow, I did not remember this. In fact I had to track down the 0.9.1 release that's somewhere on the web to see for myself. :-) Should add this to the HOPL-IV paper if I end up writing it (I'm still far from decided either way).
See? I'm still good for _something_ sometimes ;-)
I'm not clear on why it changed. I remember writing to Guido about how to disambiguate between the "bind" and "test for equality" intents in isolated expressions typed at the interactive prompt, and next thing I knew the language changed to use "==" for the latter.
Hm, that's probably why -- the desire for top-level expressions to allow comparison. Also probably the realization that this is one thing where (at the time) this particular difference with C/C++ was just annoying for most new users.
I don't have my email from those days, and have futilely tried to recall details. IIRC, it had never been discussed on the mailing list before, or in any private emails before. It just popped up one day when I was working in a Python shell, and there was _something_ subtle about it. You wrote back and expressed disappointment - that you had really wanted to keep "=" for both purposes. I started writing a reply suggesting a way out of whatever-the-heck the problem was, but before I finished the reply the next day you had already changed the implementation! Things moved quickly back then :-) Anyway, if your time machine is in good working order, I'd be pleased if you went back and restored the original vision. If, e.g., we needed to type
    >>> (x = y)
    True
at the shell to get a top-level equality comparison, BFD. I can't believe it was _that_ simple, though.
I'm assuming that <>, the ancient alternate spelling for != (that Barry still misses), came from the same source: ABC (https://homepages.cwi.nl/~steven/abc/qr.html#TESTS). But there was no compelling reason to remove <> (only to add !=) so it lingered until 3.0. Presumably ABC got both from Pascal (https://www.tutorialspoint.com/pascal/pascal_relational_operators.htm).
Good inspirations! As noted next, at least Pascal used ":=" for assignment too.
... Most languages I learned in the '70s used it: both Algols, Pascal. (Though not Fortran.)
I mentioned Icon because I'm sure Pascal didn't have "embedded assignments" at all. Unsure about Algol, but I'd be surprised (certainly not Fortran). Icon has no "statements" at all: _everything_ in Icon is an expression, generating zero or more values. Embedded assignments are frequently used in idiomatic Icon, so I think it's especially relevant that I recall no bugs due to Icon's use of ":=" (for assignment) and "==" (for equality). Programmers simply never used one when the other was intended. In C, essentially everyone uses "=" when they intend "==" at times, and - as noted - I _still_ do that in Python regularly to this day. I'd be screwed if I got an unintended assignment instead of a SyntaxError.
... The "two pages back" problem can happen just as easy with regular assignments or for-loop control variables.
Yup, but eyeballs don't have to scan every square inch of the screen for those: `for` statement targets are easy to find, and assignment statement targets start flush with the first non-blank character of an assignment statement, where the eye naturally leaps to. When assignments can be embedded anywhere, you have to look everywhere to find them. But so it goes. Even if that can't be _stopped_, it's a matter of good practice to avoid making code inscrutable.
... I gotta say I'm warming up to := in preference over 'as', *if* we're going to do this at all (not a foregone conclusion at all).
I'm not assuming it will go in, I just want to nudge the PEP toward a proposal that doesn't suck so bad it's obviously doomed ;-) I'm uncertain whether I'd support it anyway. I do know that, e.g.,

    if m := match(string) is not None:
        # do something with m

violates my sense of outrage less than anything else I've seen ;-) And, ya, I'd _use_ it if it were implemented. But I can (continue to!) live without it.
The scope question is far from easy though. I find it particularly grating that when an inline assignment occurs in an 'if' statement, its scope is the entire body of the 'if'. If that body is two pages long, by the end of it the reader (or even the writer!) may well have lost track of where it was defined and may be confused by its consequences past the end of the body.
See my "every square inch" above ;-) At least if the scope _is_ limited to the body of the `if`, it's far more limited than in C or Icon. Of course I'm more interested in whether it can be used to write clearer code than in whether it can be abused to write muddier code. List comprehensions leapt to mind there. They're wonderfully clear in prudent doses, but for a while half my Stackoverflow answers started by chiding the questioner for an irrational fear of writing obvious loops instead ;-)
On Sun, Mar 25, 2018 at 6:29 PM, Chris Angelico
On Mon, Mar 26, 2018 at 12:24 PM, Guido van Rossum
wrote: The scope question is far from easy though. I find it particularly grating that when an inline assignment occurs in an 'if' statement, its scope is the entire body of the 'if'. If that body is two pages long, by the end of it the reader (or even the writer!) may well have lost track of where it was defined and may be confused by its consequences past the end of the body.
I think this one can be given to style guides. The useful situations (eg regex match capturing) are sufficiently valuable that the less-useful ones can just come along for the ride, just like "x = lambda: ..." is perfectly valid even though "def" would be preferable.
Not so fast. There's a perfectly reasonable alternative to sublocal scopes -- just let it assign to a local variable in the containing scope. That's the same as what Python does for for-loop variables. Note that for comprehensions it still happens to do the right thing (assuming we interpret the comprehension's private local scope to be the containing scope).

This alternative has significant advantages in my view -- it doesn't require a whole new runtime mechanism to implement it (in CPython you can just generate bytecode to store and load locals), and it doesn't require a new section in the documentation to explain the new type of variable scope. Also it would make Steven happy. :-)

Perhaps we could even remove the requirement to parenthesize the new form of assignment, so we could say that at the statement level "<var> = <expr>" and "<var> := <expr>" just mean the same thing, or "=" is a shorthand for ":=", or whatever. In most cases it still makes sense to parenthesize it, since the := operator should have the same priority as the regular assignment operator, which means that "if x := f() and x != 42:" is broken and should be written as "if (x := f()) and x != 42:". But that could be a style guide issue. (Also note that I'm not proposing to add "+:=" etc., even though in C that's supported.)

It would still require carefully defining execution order in all cases, but we should probably do that anyway.

At some point we could introduce a "block" statement similar to Lua's do/end or C's blocks (also found in many other languages). But there's not really a lot of demand for this -- style guides justly frown upon functions so long that they would benefit much.

--
--Guido van Rossum (python.org/~guido)
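Both behaviours Guido leans on here are easy to verify: a for-loop variable persists in the containing scope after the loop, while a Python 3 comprehension variable does not leak out. A minimal demonstration:

```python
def demo():
    for i in range(3):
        pass
    # The loop variable survives the loop in the containing scope...
    survived = i

    squares = [n * n for n in range(5)]
    # ...but the comprehension variable 'n' does not leak out.
    try:
        n
    except NameError:
        leaked = False
    else:
        leaked = True
    return survived, squares, leaked

print(demo())  # (2, [0, 1, 4, 9, 16], False)
```
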
On Mon, Mar 26, 2018 at 3:34 PM, Guido van Rossum
On Sun, Mar 25, 2018 at 6:29 PM, Chris Angelico
wrote: On Mon, Mar 26, 2018 at 12:24 PM, Guido van Rossum
wrote: The scope question is far from easy though. I find it particularly grating that when an inline assignment occurs in an 'if' statement, its scope is the entire body of the 'if'. If that body is two pages long, by the end of it the reader (or even the writer!) may well have lost track of where it was defined and may be confused by its consequences past the end of the body.
I think this one can be given to style guides. The useful situations (eg regex match capturing) are sufficiently valuable that the less-useful ones can just come along for the ride, just like "x = lambda: ..." is perfectly valid even though "def" would be preferable.
Not so fast. There's a perfectly reasonable alternative to sublocal scopes -- just let it assign to a local variable in the containing scope. That's the same as what Python does for for-loop variables. Note that for comprehensions it still happens to do the right thing (assuming we interpret the comprehension's private local scope to be the containing scope).
This alternative has significant advantages in my view -- it doesn't require a whole new runtime mechanism to implement it (in CPython you can just generate bytecode to store and load locals), and it doesn't require a new section in the documentation to explain the new type of variable scope. Also it would make Steven happy. :-)
I'm still liking the sublocal system, but making assignment expressions capable of standing plausibly without them is a Good Thing.
Perhaps we could even remove the requirement to parenthesize the new form of assignment, so we could say that at the statement level "<var> = <expr>" and "<var> := <expr>" just mean the same thing, or "=" is a shorthand for ":=", or whatever. In most cases it still makes sense to parenthesize it, since the := operator should have the same priority as the regular assignment operator, which means that "if x := f() and x != 42:" is broken and should be written as "if (x := f()) and x != 42:". But that could be a style guide issue. (Also note that I'm not proposing to add "+:=" etc., even though in C that's supported.)
That's about where I was thinking of putting it; "test" gets defined potentially as "NAME := test". It has to right-associate. At the moment, I'm going to be restricting it to simple names only, so you can't say "x[1] := 2". That may be changed later. ChrisA
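For what it's worth, the parenthesized form Guido recommends behaves this way under the `:=` operator as eventually released in Python 3.8, so this sketch runs on modern Python (`classify` is a made-up example):

```python
def classify(value):
    # (x := value * 2) binds x first, then the 'and' comparison runs.
    # The unparenthesized "x := value * 2 and x != 42" would instead
    # group as x := ((value * 2) and (x != 42)).
    if (x := value * 2) and x != 42:
        return x
    return None

print(classify(3))   # 6
print(classify(21))  # None (x == 42 fails the second test)
print(classify(0))   # None (x == 0 is falsy)
```
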
Hi Chris, would you mind adding this syntactic form `(expr -> var)` to the alternative syntax section, with the same semantics as `(expr as var)`? It seems to me that I've seen this form previously in some thread (can't find where), but it does not appear in the alt. syntax section. To me this form has several benefits:

1. Currently it is a SyntaxError. There is some intersection with the syntax for annotating a function's return type, but it has a much smaller impact than the `as` variant.

2. It is as readable as the 'as' variant (also the expression comes first, which appeals to me).

3. It is clearly distinguishable from the usual assignment statement (which also appeals to me). Suppose someday someone will want to have a type hint on a local variable (I think sublocals are safer on this part); then:

```
while (x: int := counter):
    do_some_stuff
```

vs

```
while (counter -> x: int):
    do_some_stuff
```

Maybe it is too subjective, but the second form looks better to me, while keeping all the advantages of the `as` form.

Also this '->' form can be extended to some sort of tuple unpacking. I don't think that tuple unpacking is a good example, but nevertheless :) And it will make further attempts to introduce `+:=` impossible.

With kind regards,
-gdg
On Mon, Mar 26, 2018 at 7:14 PM, Kirill Balunov
Hi Chris, would you mind to add this syntactic form `(expr -> var)` to alternative syntax section, with the same semantics as `(expr as var)`. It seems to me that I've seen this form previously in some thread (can't find where), but it does not appear in alt. syntax section.
Can do. I'm in the middle of some large edits, and will try to remember this when I get to that section. If you see another posting of the PEP and I haven't included it, please remind me and I'll add it. ChrisA
On Mon, Mar 26, 2018 at 11:14:43AM +0300, Kirill Balunov wrote:
Hi Chris, would you mind to add this syntactic form `(expr -> var)` to alternative syntax section, with the same semantics as `(expr as var)`. It seems to me that I've seen this form previously in some thread (can't find where), but it does not appear in alt. syntax section.
That was probably my response to Nick:

https://mail.python.org/pipermail/python-ideas/2018-March/049472.html

I compared four possible choices:

    target = default if (expression as name) is None else name
    target = default if (name := expression) is None else name
    target = default if (expression -> name) is None else name
    target = default if (name <- expression) is None else name

The two arrow assignment operators <- and -> are both taken from R.

If we go down the sublocal scope path, which I'm not too keen on, then Nick's earlier comments convince me that we should avoid "as". In that case, my preferences are:

    (best) -> := <- as (worst)

If we just bind to regular locals, then my preferences are:

    (best) as -> := <- (worst)

Preferences are subject to change :-)

-- Steve
2018-03-26 14:18 GMT+03:00 Steven D'Aprano
That was probably my response to Nick:
https://mail.python.org/pipermail/python-ideas/2018-March/049472.html
I compared four possible choices:
    target = default if (expression as name) is None else name
    target = default if (name := expression) is None else name
    target = default if (expression -> name) is None else name
    target = default if (name <- expression) is None else name
Yes, most likely :)
The two arrow assignment operators <- and -> are both taken from R.
I was also thinking about `<-` variant (but with a Haskell in mind), but with the current Python rules, it seems that it does not fit:
    >>> x = 10
    >>> (x <- 5)
    False
With kind regards, -gdg
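The parse Kirill ran into can be confirmed with the `ast` module: `x <- 5` is read as the comparison `x < (-5)`, leaving no room for a left-arrow assignment operator:

```python
import ast

tree = ast.parse("x <- 5", mode="eval")
# The tokenizer splits "<-" into "<" followed by unary minus,
# yielding a Compare node: x < (-5).
assert isinstance(tree.body, ast.Compare)
assert isinstance(tree.body.ops[0], ast.Lt)
assert isinstance(tree.body.comparators[0], ast.UnaryOp)
print(ast.dump(tree.body.ops[0]))  # Lt()
```
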
On Mon, Mar 26, 2018 at 02:42:32PM +0300, Kirill Balunov wrote:
I was also thinking about `<-` variant (but with a Haskell in mind), but with the current Python rules, it seems that it does not fit:
Ah, of course not, the dreaded unary operator strikes again! (I was just chatting with Chris about unary operators off-list earlier today.) -- Steve
On 26/03/18 02:24, Guido van Rossum wrote:
I gotta say I'm warming up to := in preference over 'as',*if* we're going to do this at all (not a foregone conclusion at all).
I have the usual objection to glyphs (hard to look up or get help on), but ":=" raises two issues all of its own.

* On the plus side, it looks like some kind of assignment. People reading through the code will not be overly surprised to find it results in a name binding.

* On the minus side, it doesn't quite look like an assignment statement. While my crystal ball is cloudy, I can well imagine beginners becoming very confused over which symbol to use in which circumstance, and a lot of swearing when:

      x := f()
      if (y = g(x)) is not None:
          h(y)

  results in syntax errors.

I'm inclined to think you want assignment expressions to look unlike assignment statements to avoid this sort of confusion.

-- Rhodri James *-* Kynesim Ltd
On 26 March 2018 at 14:34, Guido van Rossum
Not so fast. There's a perfectly reasonable alternative to sublocal scopes -- just let it assign to a local variable in the containing scope. That's the same as what Python does for for-loop variables. Note that for comprehensions it still happens to do the right thing (assuming we interpret the comprehension's private local scope to be the containing scope).
I finally remembered one of the original reasons that allowing embedded assignment to target regular locals bothered me: it makes named subexpressions public members of the API if you use them at class or module scope. (I sent an off-list email to Chris about that yesterday, so the next update to the PEP is going to take it into account.)

Similarly, if you use a named subexpression in a generator or coroutine and it gets placed in the regular locals() namespace, then you've now made that reference live for as long as the generator or coroutine does, even if you never need it again.

By contrast, the sublocals idea strives to keep the *lifecycle* impact of naming a subexpression as negligible as possible - while a named subexpression might live a little longer than it used to as an anonymous subexpression (or substantially longer in the case of compound statement headers), it still wouldn't survive past the end of the statement where it appeared.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Mon, Mar 26, 2018 at 7:57 AM, Nick Coghlan
On 26 March 2018 at 14:34, Guido van Rossum
wrote: Not so fast. There's a perfectly reasonable alternative to sublocal scopes -- just let it assign to a local variable in the containing scope. That's the same as what Python does for for-loop variables. Note that for comprehensions it still happens to do the right thing (assuming we interpret the comprehension's private local scope to be the containing scope).
I finally remembered one of the original reasons that allowing embedded assignment to target regular locals bothered me: it makes named subexpressions public members of the API if you use them at class or module scope. (I sent an off-list email to Chris about that yesterday, so the next update to the PEP is going to take it into account).
Similarly, if you use a named subexpression in a generator or coroutine and it gets placed in the regular locals() namespace, then you've now made that reference live for as long as the generator or coroutine does, even if you never need it again.
By contrast, the sublocals idea strives to keep the *lifecycle* impact of naming a subexpression as negligible as possible - while a named subexpression might live a little longer than it used to as an anonymous subexpression (or substantially longer in the case of compound statement headers), it still wouldn't survive past the end of the statement where it appeared.
But this is not new: if you use a for-loop to initialize some class-level structure you have the same problem. There is also a standard solution (just 'del' it). -- --Guido van Rossum (python.org/~guido)
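Guido's comparison to for-loops at class scope, and the standard `del` fix, can be checked directly on any modern Python. A minimal sketch (the class names `Leaky` and `Clean` are illustrative only):

```python
# For-loop variables leak into a class namespace; comprehension loop
# variables (since Python 3) do not, because a comprehension runs in
# its own implicit scope.
class Leaky:
    # A comprehension's loop variable stays private to the comprehension
    squares = [i * i for i in range(4)]
    # A plain for-loop's variable becomes a class attribute
    total = 0
    for n in range(4):
        total += n

assert Leaky.squares == [0, 1, 4, 9]
assert Leaky.total == 6
assert Leaky.n == 3                # "n" leaked into the class API
assert not hasattr(Leaky, "i")     # the comprehension variable did not

class Clean:
    total = 0
    for n in range(4):
        total += n
    del n                          # the standard fix Guido mentions

assert not hasattr(Clean, "n")
```

This is exactly the trade-off being debated: binding to regular locals makes embedded-assignment names part of a class or module namespace unless explicitly deleted.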
On 26/03/2018 16:57, Guido van Rossum wrote:
On Mon, Mar 26, 2018 at 7:57 AM, Nick Coghlan
wrote:

On 26 March 2018 at 14:34, Guido van Rossum wrote:

> Not so fast. There's a perfectly reasonable alternative to sublocal scopes -- just let it assign to a local variable in the containing scope. That's the same as what Python does for for-loop variables. Note that for comprehensions it still happens to do the right thing (assuming we interpret the comprehension's private local scope to be the containing scope).

I finally remembered one of the original reasons that allowing embedded assignment to target regular locals bothered me: it makes named subexpressions public members of the API if you use them at class or module scope. (I sent an off-list email to Chris about that yesterday, so the next update to the PEP is going to take it into account).
Similarly, if you use a named subexpression in a generator or coroutine and it gets placed in the regular locals() namespace, then you've now made that reference live for as long as the generator or coroutine does, even if you never need it again.
By contrast, the sublocals idea strives to keep the *lifecycle* impact of naming a subexpression as negligible as possible - while a named subexpression might live a little longer than it used to as an anonymous subexpression (or substantially longer in the case of compound statement headers), it still wouldn't survive past the end of the statement where it appeared.
But this is not new: if you use a for-loop to initialize some class-level structure you have the same problem. There is also a standard solution (just 'del' it). True. But there is a case for also saying that the for-loop variable's scope should have been limited to the for-loop-suite. (Not that it's feasible to make that change now, of course.)
If I had a time-machine, I would add an assignment character (probably looking something like <- ) to the original ASCII character set. Then "=" means equality - job done. Actually, probably a right-assignment character ( -> ) as well. Rob Cliffe
On Mon, Mar 26, 2018 at 03:33:52PM +0100, Rhodri James wrote:
While my crystal ball is cloudy, I can well imagine beginners becoming very confused over which symbol to use in which circumstance, and a lot of swearing when:
    x := f()
    if (y = g(x)) is not None:
        h(y)
results in syntax errors.
I remember as a beginner being terribly confused when writing dicts and constantly writing {key=value}. It is part of the learning process, and while we shouldn't intentionally make things harder for beginners just for the sake of making it harder, we shouldn't necessarily give them veto over new features :-)

(I must admit that even now, if I'm tired and distracted I occasionally make this same mistake.)

But we also have the opportunity to make things easier for them. I presume that the syntax error could diagnose the error and tell them how to fix it:

    SyntaxError: cannot use := in a stand-alone statement, use =
    SyntaxError: cannot use = in an assignment expression, use :=

or similar. Problem solved.

-- Steve
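The statement/expression split being discussed here can be probed mechanically on a Python that implements PEP 572 (3.8+). Exact error messages vary between versions, so this throwaway `is_valid` helper only checks accept/reject:

```python
# Which spellings does the compiler accept? Compile without executing,
# so undefined names like g() and h() don't matter.
def is_valid(src):
    try:
        compile(src, "<test>", "exec")
        return True
    except SyntaxError:
        return False

assert not is_valid("if (y = g(x)) is not None: h(y)")  # "=" is statement-only
assert is_valid("if (y := 0) is not None: pass")        # ":=" works in expressions
assert not is_valid("x := f()")                         # bare ":=" as a statement
assert is_valid("x = f()")                              # plain assignment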
On 03/26/2018 04:18 AM, Steven D'Aprano wrote:
On Mon, Mar 26, 2018 at 11:14:43AM +0300, Kirill Balunov wrote:
Hi Chris, would you mind to add this syntactic form `(expr -> var)` to alternative syntax section, with the same semantics as `(expr as var)`. It seems to me that I've seen this form previously in some thread (can't find where), but it does not appear in alt. syntax section.
That was probably my response to Nick:
https://mail.python.org/pipermail/python-ideas/2018-March/049472.html
I compared four possible choices:
    target = default if (expression as name) is None else name
    target = default if (name := expression) is None else name
    target = default if (expression -> name) is None else name
    target = default if (name <- expression) is None else name
The two arrow assignment operators <- and -> are both taken from R.
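Of the four spellings, `name := expression` is the one Python 3.8 ultimately adopted, so that variant is runnable today. A sketch wrapping Steven's one-liner in a hypothetical `lookup` helper:

```python
def lookup(expression, default):
    # Bind the subexpression once and reuse it, instead of evaluating twice
    return default if (name := expression) is None else name

assert lookup(None, "fallback") == "fallback"
assert lookup("hit", "fallback") == "hit"
```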
If we go down the sublocal scope path, which I'm not too keen on, then Nick's earlier comments convince me that we should avoid "as". In that case, my preferences are:
(best) -> := <- as (worst)
If we just bind to regular locals, then my preferences are:
(best) as -> := <- (worst)
Preferences are subject to change :-)
Obviously we're bikeshedding here, but personally I detest these kinds of operators. To me - is a minus sign and < and > are less-than and greater-than. Trying to re-use these characters in ways that depend on their visual form strikes me as very ugly. <= makes sense because its meaning is a combination of the *meanings* of < and =, but <- as assignment is not a combination of the meanings of < and -. If we need a two-character operator, I prefer something like := that doesn't try to be a picture.
[Guido]
... Not so fast. There's a perfectly reasonable alternative to sublocal scopes -- just let it assign to a local variable in the containing scope. That's the same as what Python does for for-loop variables.
That's certainly what I would _expect_ if I never read the docs, conditioned by experience with Python's `for` and embedded assignments in at least C and Icon. But I have to confess I already gave up trying to stay up-to-date with all of Python's _current_ scope rules. It's not what I want to think about. It's easier all around to try not to reuse names in clever ways to begin with.
On 27 March 2018 at 01:57, Guido van Rossum
On Mon, Mar 26, 2018 at 7:57 AM, Nick Coghlan
wrote: By contrast, the sublocals idea strives to keep the *lifecycle* impact of naming a subexpression as negligible as possible - while a named subexpression might live a little longer than it used to as an anonymous subexpression (or substantially longer in the case of compound statement headers), it still wouldn't survive past the end of the statement where it appeared.
But this is not new: if you use a for-loop to initialize some class-level structure you have the same problem. There is also a standard solution (just 'del' it).
Right, but that's annoying, too, and adds "Am I polluting a namespace I care about?" to something that would ideally be a purely statement local consideration (and currently is for comprehensions and generator expressions). Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On 2018-03-23 06:01, Chris Angelico wrote:
A suggestion: Under the rejected "Special-casing comprehensions", you show "prefix-local-name-bindings": Name bindings that appear before the loop, like:
stuff = [(y, x/y) where y = f(x) for x in range(5)]
Please add mention of rejecting "postfix-local-name-bindings": Name bindings that happen after the loop. For example:
stuff = [(y, x/y) for x in range(5) where y = f(x)]
Since all the same reasoning applies to both prefix and postfix variations, maybe distinguishing between prefix and postfix can be done in the last paragraph of "Special-casing comprehensions". Thank you.
On Tue, Mar 27, 2018 at 7:00 AM, Nick Coghlan
On 27 March 2018 at 01:57, Guido van Rossum
wrote: On Mon, Mar 26, 2018 at 7:57 AM, Nick Coghlan
wrote: By contrast, the sublocals idea strives to keep the *lifecycle* impact of naming a subexpression as negligible as possible - while a named subexpression might live a little longer than it used to as an anonymous subexpression (or substantially longer in the case of compound statement headers), it still wouldn't survive past the end of the statement where it appeared.
But this is not new: if you use a for-loop to initialize some class-level structure you have the same problem. There is also a standard solution (just 'del' it).
Right, but that's annoying, too, and adds "Am I polluting a namespace I care about?" to something that would ideally be a purely statement local consideration (and currently is for comprehensions and generator expressions).
The standard reply here is that if you can't tell at a glance whether that's the case, your code is too complex. The Zen of Python says "Namespaces are one honking great idea -- let's do more of those!" and in this case that means refactor into smaller namespaces, i.e. functions/methods. -- --Guido van Rossum (python.org/~guido)
On 27/03/2018 16:22, Guido van Rossum wrote:
On Tue, Mar 27, 2018 at 7:00 AM, Nick Coghlan
wrote:

On 27 March 2018 at 01:57, Guido van Rossum wrote:

> On Mon, Mar 26, 2018 at 7:57 AM, Nick Coghlan wrote:
>> By contrast, the sublocals idea strives to keep the *lifecycle* impact of naming a subexpression as negligible as possible - while a named subexpression might live a little longer than it used to as an anonymous subexpression (or substantially longer in the case of compound statement headers), it still wouldn't survive past the end of the statement where it appeared.
>
> But this is not new: if you use a for-loop to initialize some class-level structure you have the same problem. There is also a standard solution (just 'del' it).

Right, but that's annoying, too, and adds "Am I polluting a namespace I care about?" to something that would ideally be a purely statement local consideration (and currently is for comprehensions and generator expressions).
The standard reply here is that if you can't tell at a glance whether that's the case, your code is too complex. The Zen of Python says "Namespaces are one honking great idea -- let's do more of those!" and in this case that means refactor into smaller namespaces, i.e. functions/methods.
This is not always satisfactory. If your for-loop uses 20 already-defined-locals, do you want to refactor it into a function with 20 parameters? Regards Rob Cliffe
On Wed, Mar 28, 2018 at 12:08:24AM +0100, Rob Cliffe via Python-ideas wrote:
On 27/03/2018 16:22, Guido van Rossum wrote:
The standard reply here is that if you can't tell at a glance whether that's the case, your code is too complex. The Zen of Python says "Namespaces are one honking great idea -- let's do more of those!" and in this case that means refactor into smaller namespaces, i.e. functions/methods.
This is not always satisfactory. If your for-loop uses 20 already-defined-locals, do you want to refactor it into a function with 20 parameters?
The standard reply here is that if your for-loop needs 20 locals, your function is horribly over-complex and you may need to rethink your design. And if you don't think "20 locals" is too many, okay, how about 50? 100? 1000? At some point we'll all agree that the function is too complex.

We don't have an obligation to solve every problem of excess complexity, especially when the nominal solution involves adding complexity elsewhere. For 25 years, the solution to complex functions in Python has been to refactor or simplify them. That strategy has worked well in practice, notwithstanding your hypothetical function.

If you genuinely do have a function that is so highly coupled with so many locals that it is hard to refactor, then you have my sympathy but we have no obligation to add a band-aid for it to the language.

Putting the loop variable in its own scope doesn't do anything about the real problem: you have a loop that needs to work with twenty other local variables. Any other modification to the loop will run into the same problem: you have to check the rest of the function to ensure you're not clobbering one of the twenty other variables. Special-casing the loop variable seems hardly justified.

If there is a justification for introducing sub-local scoping, then I think it needs to be something better than pathologically over-complex functions.

-- Steve
On 28/03/2018 01:19, Steven D'Aprano wrote:
On Wed, Mar 28, 2018 at 12:08:24AM +0100, Rob Cliffe via Python-ideas wrote:
On 27/03/2018 16:22, Guido van Rossum wrote:
The standard reply here is that if you can't tell at a glance whether that's the case, your code is too complex. The Zen of Python says "Namespaces are one honking great idea -- let's do more of those!" and in this case that means refactor into smaller namespaces, i.e. functions/methods.
This is not always satisfactory. If your for-loop uses 20 already-defined-locals, do you want to refactor it into a function with 20 parameters? The standard reply here is that if your for-loop needs 20 locals, your function is horribly over-complex and you may need to rethink your design.
And if you don't think "20 locals" is too many, okay, how about 50? 100? 1000? At some point we'll all agree that the function is too complex.
We don't have an obligation to solve every problem of excess complexity, especially when the nominal solution involves adding complexity elsewhere.
For 25 years, the solution to complex functions in Python has been to refactor or simplify them. That strategy has worked well in practice, not withstanding your hypothetical function.
If you genuinely do have a function that is so highly coupled with so many locals that it is hard to refactor, then you have my sympathy but we have no obligation to add a band-aid for it to the language. It's a fact of life that some tasks *are* complicated. I daresay most aren't, or don't need to be, but some are.
Putting the loop variable in its own scope doesn't do anything about the real problem: you have a loop that needs to work with twenty other local variables. Any other modification to the loop will run into the same problem: you have to check the rest of the function to ensure you're not clobbering one of the twenty other variables. Special-casing the loop variable seems hardly justified.
If there is a justification for introducing sub-local scoping, then I think it needs to be something better than pathologically over-complex functions.
But putting the loop variable in its own scope solves one problem: it ensures that the variable is confined to that loop, and you don't have to worry about whether a variable of the same name occurs elsewhere in your function. In other words it increases local transparency (I'm not sure that's the right phrase, but I'm struggling to bring a more appropriate one to mind) and hence increases readability. (I understand your point about being able to inspect the for-loop variable after the for-loop has terminated - I've probably done it myself - but it's a matter of opinion whether that convenience outweighs the cleanliness of confining the for-variable's scope.) Regards Rob Cliffe
This thread is dead.
-- --Guido van Rossum (python.org/~guido)
On 03/23/2018 03:01 AM, Chris Angelico wrote:
Apologies for letting this languish; life has an annoying habit of getting in the way now and then.
My simple response to all of this is that it's not worth it. Each new example convinces me more and more that in almost every case, sublocal assignments DECREASE readability as long as they occur inline. If the statement is very simple, the sublocal assignments make it complex. If it is complex, they do not aid in seeing parallelism between different pieces that reuse the same value, because the sublocal assignment itself creates an asymmetry.

The only alternatives that I see as increasing readability are the "rejected" alternatives in which the sublocal assignment is moved "out of order" so that all references to it look the same and are separated from the (single) assignment --- i.e., the variants of the form "x = a+b with a='foo', b='bar'".

(I think someone already mentioned this, but these variants, even if rejected, probably shouldn't be placed under the header of "special-casing comprehensions". Extracting the assignment to a with-clause makes sense outside of comprehensions too. It would make more sense to label them as "out of order" or "non-inline" or perhaps "cleft assignment", by analogy with cleft constructions in natural language.)
On Mar 25 2018, Guido van Rossum
I gotta say I'm warming up to := in preference over 'as', *if* we're going to do this at all (not a foregone conclusion at all).
I'm surprised that no one has mentioned it yet, so as a quick datapoint: Go also uses := for assignment, so there's some precedent. Best, -Nikolaus -- GPG Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F »Time flies like an arrow, fruit flies like a Banana.«
On Fri, Mar 30, 2018 at 12:04 PM, Nikolaus Rath
On Mar 25 2018, Guido van Rossum
wrote: I gotta say I'm warming up to := in preference over 'as', *if* we're going to do this at all (not a foregone conclusion at all).
I'm surprised that no one has mentioned it yet, so as a quick datapoint: Go also uses := for assignment, so there's some precedent.
It's irrelevant, because Go's solution for inline assignment is entirely different. (And there was no question that := is commonly used for assignment -- just look it up on Wikipedia.) -- --Guido van Rossum (python.org/~guido)
Yes, I first came across := when learning (Turbo) Pascal in the early '90s. However golang managed to screw it up - it only works there as a "short declaration AND assignment" operator. You can't use it twice on the same variable! Boggles the mind how experienced designers came up with that one. ;-) Maybe Algol did it that way? (before my time)

I found Pascal's syntax - := for assignment, = and <> for tests - about as close to perfect in ease of learning/comprehension as it gets, from someone who studied math before C anyway.

-Mike

On 2018-03-30 12:04, Nikolaus Rath wrote:
IIRC Algol-68 (the lesser-known, more complicated version) used 'int x = 0;' to declare a constant and 'int x := 0;' to declare a variable. And there was a lot more to it; see https://en.wikipedia.org/wiki/ALGOL_68#mode:_Declarations. I'm guessing Go reversed this because they want '=' to be the common assignment (whereas in Algol-68 the common assignment was ':=').
My current thinking about Python is that if we're doing this, '=' and ':=' will mean the same thing but inside an expression you must use ':='. Chris, Nick and I are working out some details off-list.
On Mon, Apr 2, 2018 at 1:51 PM, Mike Miller
Yes, I first came across := when learning (Turbo) Pascal in the early 90's.
However golang managed to screw it up—it only works there as a "short declaration AND assignment" operator. You can't use it twice on the same variable! Boggles the mind how experienced designers came up with that one. ;-) Maybe Algol did it that way? (before my time)
I found Pascal's syntax, := for assignment, = and <>, for tests about close to perfect in ease of learning/comprehension as it gets, from someone who studied math before C anyway.
-Mike
On 2018-03-30 12:04, Nikolaus Rath wrote:
-- --Guido van Rossum (python.org/~guido)
On 23 March 2018 at 20:01, Chris Angelico
Apologies for letting this languish; life has an annoying habit of getting in the way now and then.
Feedback from the previous rounds has been incorporated. From here, the most important concern and question is: Is there any other syntax or related proposal that ought to be mentioned here? If this proposal is rejected, it should be rejected with a full set of alternatives.
I was writing a new stdlib test case today, and thinking about how I might structure it differently in a PEP 572 world, and realised that a situation the next version of the PEP should discuss is this one:

    # Dict display
    data = {
        key_a: 1,
        key_b: 2,
        key_c: 3,
    }

    # Set display with local name bindings
    data = {
        local_a := 1,
        local_b := 2,
        local_c := 3,
    }

    # List display with local name bindings
    data = [
        local_a := 1,
        local_b := 2,
        local_c := 3,
    ]

    # Dict display with local value name bindings
    data = {
        key_a: local_a := 1,
        key_b: local_b := 2,
        key_c: local_c := 3,
    }

    # Dict display with local key name bindings
    data = {
        local_a := key_a: 1,
        local_b := key_b: 2,
        local_c := key_c: 3,
    }

I don't think this is bad (although the interaction with dicts is a bit odd), and I don't think it counts as a rationale either, but I do think the fact that it becomes possible should be noted as an outcome arising from the "No sublocal scoping" semantics.

Cheers, Nick.

P.S. The specific test case is one where I want to test the three different ways of spelling "the current directory" in some sys.path manipulation code (the empty string, os.curdir, and os.getcwd()), and it occurred to me that a version of PEP 572 that omits the sublocal scoping concept will allow inline naming of parts of data structures as you define them.

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
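Under the no-sublocal-scoping semantics that Python 3.8 eventually shipped, displays like these are live code, with one wrinkle: the final grammar requires parentheses around a walrus used as a dict key or value (assumed below). A sketch of the observable behaviour:

```python
# Names bound inside a display are ordinary locals of the containing
# scope - they remain visible after the display is built.
data = {
    local_a := 1,
    local_b := 2,
    local_c := 3,
}
assert data == {1, 2, 3}
assert (local_a, local_b, local_c) == (1, 2, 3)   # bindings leaked out

# A walrus in a dict value needs parentheses in the final grammar;
# an earlier binding is usable later in the same display.
keyed = {
    "key_a": (value_a := 10),
    "key_b": value_a + 1,
}
assert keyed == {"key_a": 10, "key_b": 11}
```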
On Sun, Apr 08, 2018 at 09:25:33PM +1000, Nick Coghlan wrote:
I was writing a new stdlib test case today, and thinking about how I might structure it differently in a PEP 572 world, and realised that a situation the next version of the PEP should discuss is this one:
    # Dict display
    data = {
        key_a: 1,
        key_b: 2,
        key_c: 3,
    }

    # Set display with local name bindings
    data = {
        local_a := 1,
        local_b := 2,
        local_c := 3,
    }
I don't understand the point of these examples. Sure, I guess they would be legal, but unless you're actually going to use the name bindings, what's the point in defining them?

    data = {
        1,
        (spam := complex_expression),
        spam+1,
        spam*2,
    }

which I think is cleaner than the existing alternative of defining spam outside of the set. And for dicts:

    d = {
        'key': 'value',
        (spam := calculated_key): (eggs := calculated_value),
        spam.lower(): eggs.upper(),
    }
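Steven's sketch runs as-is on Python 3.8+ once the placeholders are given concrete values. The stand-in values below are purely illustrative, and the keys are chosen so the `.lower()`/`.upper()` entries don't collide with the literal ones:

```python
complex_expression = 10  # stand-in for some expensive computation
data = {
    1,
    (spam := complex_expression),
    spam + 1,
    spam * 2,
}
assert data == {1, 10, 11, 20}

calculated_key, calculated_value = "Spam", "Eggs"  # illustrative stand-ins
d = {
    "key": "value",
    (spam := calculated_key): (eggs := calculated_value),
    spam.lower(): eggs.upper(),
}
assert d == {"key": "value", "Spam": "Eggs", "spam": "EGGS"}
```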
I don't think this is bad (although the interaction with dicts is a bit odd), and I don't think it counts as a rationale either, but I do think the fact that it becomes possible should be noted as an outcome arising from the "No sublocal scoping" semantics.
If we really wanted to keep the sublocal scoping, we could make list/set/dict displays their own scope too. Personally, that's the only argument for sublocal scoping that I like yet: what happens inside a display should remain inside the display, and not leak out into the function. So that has taken me from -1 on sublocal scoping to -0.5 if it applies to displays. -- Steve
On Sun, Apr 8, 2018 at 8:01 AM, Steven D'Aprano
If we really wanted to keep the sublocal scoping, we could make list/set/dict displays their own scope too.
Personally, that's the only argument for sublocal scoping that I like yet: what happens inside a display should remain inside the display, and not leak out into the function.
That sounds like a reasonable proposal that we could at least consider. But I think it will not fly. Presumably it doesn't apply to tuple displays, because of reasonable examples like ((a := f(), a+1), a+2), and because it would create an ugly discontinuity between (a := f()) and (a := f(),). But then switching between [a := f(), a] and (a := f(), a) would create a discontinuity. For comprehensions and generator expressions there is no such discontinuity in the new proposal, since these *already* introduce their own scope. -- --Guido van Rossum (python.org/~guido)
# Dict display
data = {
key_a: local_a := 1,
key_b: local_b := 2,
key_c: local_c := 3,
}
Isn’t this a set display with local assignments and type annotations? :o)
(I’m -1 on all of these ideas, btw. None help readability for me, and I read much more code than I write.)
Top-posted from my Windows phone
From: Nick Coghlan
Sent: Sunday, April 8, 2018 6:27
To: Chris Angelico
Cc: python-ideas
Subject: Re: [Python-ideas] PEP 572: Statement-Local Name Bindings,take three!
On 9 April 2018 at 01:01, Steven D'Aprano
On Sun, Apr 08, 2018 at 09:25:33PM +1000, Nick Coghlan wrote:
I was writing a new stdlib test case today, and thinking about how I might structure it differently in a PEP 572 world, and realised that a situation the next version of the PEP should discuss is this one:
    # Dict display
    data = {
        key_a: 1,
        key_b: 2,
        key_c: 3,
    }

    # Set display with local name bindings
    data = {
        local_a := 1,
        local_b := 2,
        local_c := 3,
    }
I don't understand the point of these examples. Sure, I guess they would be legal, but unless you're actually going to use the name bindings, what's the point in defining them?
That *would* be the point. In the case where it occurred to me, the actual code I'd written looked like this:

    curdir_import = ""
    curdir_relative = os.curdir
    curdir_absolute = os.getcwd()
    all_spellings = [curdir_import, curdir_relative, curdir_absolute]

(Since I was testing the pydoc CLI's sys.path manipulation, and wanted to cover all the cases).
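With the PEP 572 semantics that landed in Python 3.8, Nick's four lines collapse into one list display whose side-effect bindings stay available afterwards:

```python
import os

# Build the list and name its parts in a single display
all_spellings = [
    curdir_import := "",
    curdir_relative := os.curdir,
    curdir_absolute := os.getcwd(),
]
assert all_spellings == ["", os.curdir, os.getcwd()]
assert curdir_import == ""
assert curdir_relative == os.curdir
```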
I don't think this is bad (although the interaction with dicts is a bit odd), and I don't think it counts as a rationale either, but I do think the fact that it becomes possible should be noted as an outcome arising from the "No sublocal scoping" semantics.
If we really wanted to keep the sublocal scoping, we could make list/set/dict displays their own scope too.
Personally, that's the only argument for sublocal scoping that I like yet: what happens inside a display should remain inside the display, and not leak out into the function.
So that has taken me from -1 on sublocal scoping to -0.5 if it applies to displays.
Inflicting the challenges that comprehensions have at class scope on all container displays wouldn't strike me as a desirable outcome (plus there's also the problem that full nested scopes are relatively expensive at runtime). Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
If anyone is interested I came across this same subject on a blog post and discussion on HN today: - https://www.hillelwayne.com/post/equals-as-assignment/ - https://news.ycombinator.com/item?id=16803874 On 2018-04-02 15:03, Guido van Rossum wrote:
IIRC Algol-68 (the lesser-known, more complicated version) used 'int x = 0;' to declare a constant and 'int x := 0;' to declare a variable. And there was a lot more to it; see https://en.wikipedia.org/wiki/ALGOL_68#mode:_Declarations. I'm guessing Go reversed this because they want '=' to be the common assignment (whereas in Algol-68 the common assignment was ':=').
On Wed, Apr 11, 2018 at 1:15 PM, Mike Miller
If anyone is interested I came across this same subject on a blog post and discussion on HN today:
- https://www.hillelwayne.com/post/equals-as-assignment/ - https://news.ycombinator.com/item?id=16803874
Those people who say "x = x + 1" makes no sense... do they also get confused by the fact that you can multiply a string by a number? Programming is not algebra.

The ONLY reason that "x = x + 1" can fail to make sense is if you start by assuming that there is no such thing as time. That's the case in algebra, but it simply isn't true in software. Functional programming languages are closer to algebra than imperative languages are, but that doesn't mean they _are_ algebraic, and they go to great lengths to lie about how you can have side-effect-free side effects and such.

Fortunately, Python is not bound by such silly rules, and can do things because they are useful for real-world work. Thus the question of ":=" vs "=" vs "==" vs "===" comes down to what is actually worth doing, not what would look tidiest to someone who is trying to represent a mathematician's blackboard in ASCII.

ChrisA
On 2018-04-11 04:15, Mike Miller wrote:
If anyone is interested I came across this same subject on a blog post and discussion on HN today:
It says "BCPL also introduced braces as a means of defining blocks.". That bit is wrong, unless "braces" is being used as a generic term. BCPL used $( and $).
participants (18)
- BrenBarn
- Chris Angelico
- Eric V. Smith
- Ethan Furman
- Guido van Rossum
- Kirill Balunov
- Kyle Lahnakoski
- Mike Miller
- MRAB
- Nick Coghlan
- Nikolaus Rath
- Paul Moore
- Rhodri James
- Rob Cliffe
- Steve Dower
- Steven D'Aprano
- Tim Peters
- Zero Piraeus