Re: [Python-ideas] Coming up with an alternative to PEP 505's None-aware operators

I don't know... to me this looks downright ugly and like an awkward special case. It feels like it combines the reading difficulty of inline assignment with the awkwardness of a magic word and the ugliness of using ?. Basically, every con of the other proposals combined...

--
Ryan (ライアン)
Yoko Shimomura, ryo (supercell/EGOIST), Hiroyuki Sawano >> everyone else
https://refi64.com/

On Feb 15, 2018 at 8:07 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:

The recent thread on variable assignment in comprehensions has prompted me to finally share https://gist.github.com/ncoghlan/a1b0482fc1ee3c3a11fc7ae64833a315 with a wider audience (see the comments there for some notes on iterations I've already been through on the idea).

== The general idea ==

The general idea would be to introduce a *single* statement local reference using a new keyword with a symbolic prefix: "?it"

* `(?it=expr)` is a new atomic expression for an "it reference binding" (whitespace would be permitted around "?it" and "=", but PEP 8 would recommend against it in general)
* subsequent subexpressions (in execution order) can reference the bound subexpression using `?it` (an "it reference")
* `?it` is reset between statements, including before entering the suite within a compound statement (if you want a persistent binding, use a named variable)
* for conditional expressions, put the reference binding in the conditional, as that gets executed first
* to avoid ambiguity, especially in function calls (where it could be confused with keyword argument syntax), the parentheses around reference bindings are always required
* unlike regular variables, you can't close over statement local references (the nested scope will get an UnboundLocalError if you try it)

The core inspiration here is English pronouns (hence the choice of keyword): we don't generally define arbitrary terms in the middle of sentences, but we *do* use pronouns to refer back to concepts introduced earlier in the sentence. And while it's not an especially common practice, pronouns are sometimes even used in a sentence *before* the concept they refer to ;)

If we did pursue this, then PEPs 505, 532, and 535 would all be withdrawn or rejected (with the direction being to use an it-reference instead).

== Examples ==

`None`-aware attribute access:

    value = ?it.strip()[4:].upper() if (?it=var1) is not None else None

`None`-aware subscript access:

    value = ?it[4:].upper() if (?it=var1) is not None else None

`None`-coalescence:

    value = ?it if (?it=var1) is not None else ?it if (?it=var2) is not None else var3

`NaN`-coalescence:

    value = ?it if not math.isnan((?it=var1)) else ?it if not math.isnan((?it=var2)) else var3

Conditional function call:

    value = ?it() if (?it=calculate) is not None else default

Avoiding repeated evaluation of a comprehension filter condition:

    filtered_values = [?it for x in keys if (?it=get_value(x)) is not None]

Avoiding repeated evaluation for range and slice bounds:

    range((?it=calculate_start()), ?it+10)
    data[(?it=calculate_start()):?it+10]

Avoiding repeated evaluation in chained comparisons:

    value if (?it=lower_bound()) <= value < ?it+tolerance else 0

Avoiding repeated evaluation in an f-string:

    print(f"{(?it=get_value())!r} is printed in pure ASCII as {?it!a} and in Unicode as {?it}")

== Possible future extensions ==

One possible future extension would be to pursue PEP 3150, treating the nested namespace as an it reference binding, giving:

    sorted_data = sorted(data, key=?it.sort_key) given ?it=:
        def sort_key(item):
            return item.attr1, item.attr2

(A potential bonus of that spelling is that it may be possible to make "given ?it=:" the syntactic keyword introducing the suite, allowing "given" itself to continue to be used as a variable name)

Another possible extension would be to combine it references with `as` clauses on if statements and while loops:

    if (?it=pattern.match(data)) is not None as matched:
        ...

    while (?it=pattern.match(data)) is not None as matched:
        ...

== Why not arbitrary embedded assignments? ==

Primarily because embedded assignments are inherently hard to read, especially in long expressions. Restricting things to one pronoun, and then pursuing PEP 3150's given clause in order to expand to multiple statement local names, should help nudge folks towards breaking things up into multiple statements rather than writing ever more complex one-liners.

That said, the ?-prefix notation is deliberately designed such that it *could* be used with arbitrary identifiers rather than being limited to a single specific keyword, and the explicit lack of closure support means that there wouldn't be any complex nested scope issues associated with lambda expressions, generator expressions, or container comprehensions.

With that approach, "?it" would just be an idiomatic default name like "self" or "cls" rather than being a true keyword. Given arbitrary identifier support, some of the earlier examples might instead be written as:

    value = ?f() if (?f=calculate) is not None else default
    range((?start=calculate_start()), ?start+10)
    value if (?lower=lower_bound()) <= value < ?lower+tolerance else 0

The main practical downside to this approach is that *all* the semantic weight ends up resting on the symbolic "?" prefix, which makes it very difficult to look up as a new Python user. With a keyword embedded in the construct, there's a higher chance that folks will be able to guess the right term to search for (i.e. "python it expression" or "python it keyword").

Another downside of this more flexible option is that it likely *wouldn't* be amenable to the "if expr as name:" syntax extension, as there wouldn't be a single defined pronoun expression to bind the name to.

However, the extension to PEP 3150 would allow the statement local namespace to be given an arbitrary name:

    sorted_data = sorted(data, key=?ns.sort_key) given ?ns=:
        def sort_key(item):
            return item.attr1, item.attr2

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
_______________________________________________
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/
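For comparison, here is a rough sketch of how a few of the examples above can be spelled in Python as it already exists, using an ordinary temporary variable in place of `?it`. The function names (`none_aware_strip`, `coalesce`, `nan_coalesce`, `filtered`) are hypothetical helpers invented for this illustration and are not part of the proposal.

```python
import math


def none_aware_strip(var1):
    # Counterpart of:
    #   ?it.strip()[4:].upper() if (?it=var1) is not None else None
    it = var1
    return it.strip()[4:].upper() if it is not None else None


def coalesce(var1, var2, var3):
    # Counterpart of the None-coalescence example: first non-None value wins,
    # with var3 as the unconditional fallback.
    for it in (var1, var2):
        if it is not None:
            return it
    return var3


def nan_coalesce(var1, var2, var3):
    # Counterpart of the NaN-coalescence example.
    for it in (var1, var2):
        if not math.isnan(it):
            return it
    return var3


def filtered(keys, get_value):
    # Counterpart of the comprehension filter example: an inner generator
    # ensures get_value(x) is evaluated only once per key.
    return [it for it in (get_value(x) for x in keys) if it is not None]
```

The price paid here is exactly the one the proposal is trying to avoid: each case needs either a separate statement or a helper function instead of fitting into a single expression.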

On 16 February 2018 at 12:19, rymg19@gmail.com <rymg19@gmail.com> wrote:
Yeah, it's tricky to find a spelling that looks nice without being readily confusable with other existing constructs (most notably keyword arguments).

The cleanest *looking* idea I've come up with would be to allow arbitrary embedded assignments to ordinary frame local variables using the "(expr as name)" construct:

    value = tmp.strip()[4:].upper() if (var1 as tmp) is not None else None
    value = tmp[4:].upper() if (var1 as tmp) is not None else None
    value = tmp if (var1 as tmp) is not None else tmp if (var2 as tmp) is not None else var3
    value = tmp if not math.isnan((var1 as tmp)) else tmp if not math.isnan((var2 as tmp)) else var3
    value = f() if (calculate as f) is not None else default
    filtered_values = [val for x in keys if (get_value(x) as val) is not None]
    range((calculate_start() as start), start+10)
    data[(calculate_start() as start):start+10]
    value if (lower_bound() as min_val) <= value < min_val+tolerance else 0
    print(f"{(get_value() as tmp)!r} is printed in pure ASCII as {tmp!a} and in Unicode as {tmp}")

However, while I think that looks nicer in general, we'd still have to choose between two surprising behaviours:

* implicitly delete the statement locals after the statement where they're set (which still overwrites any values previously bound to those names, similar to what happens with exception clauses)
* skip deleting, which means references to subexpressions may last longer than expected (and we'd have the problem where embedded assignments could overwrite existing local variables)

The interaction with compound statements would also be tricky to figure out (especially if we went with the "delete after the statement" behaviour).

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
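The "similar to what happens with exception clauses" behaviour mentioned in the first bullet can be observed in Python 3 today. The following is a small sketch (the function name `except_clause_demo` is made up for the illustration): the `except ... as exc` clause first overwrites the existing local `exc` and then implicitly deletes it when the clause ends.

```python
def except_clause_demo():
    exc = "previous value"  # an ordinary local that happens to share the name
    try:
        raise ValueError("boom")
    except ValueError as exc:  # rebinds exc, overwriting "previous value"
        message = str(exc)
    # At this point Python 3 has implicitly run "del exc".
    try:
        exc
    except UnboundLocalError:
        return message, "exc is unbound"
    return message, exc
```

So the earlier value is lost and the name is gone as well, which is the combination the first option would extend to every embedded assignment.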

What about a (| val = get_value(x) |) assignment expression, which would be True on success and None otherwise?

So it would be:

    value = f() if (| f = calculate |) else default

…The idea is inspired by C's assignment, but it needs some special treatment for anything that is False in a boolean context.

With kind regards,
-gdg

2018-02-16 10:55 GMT+03:00 Nick Coghlan <ncoghlan@gmail.com>:

On 16 February 2018 at 18:36, Kirill Balunov <kirillbalunov@gmail.com> wrote:
If we're going to allow arbitrary embedded assignments, then "(expr as name)" is the most likely spelling, since:

* "as" is already a keyword
* "expr as name" is already used for name binding related purposes (albeit not for simple assignments)
* "python as expression" and "python as keyword" are both things search engines will accept as queries (search engines tend not to cope very well when you try to search for punctuation characters)

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
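As a reminder of the existing name binding uses of "as" referred to above, here is a short sketch in current Python. The variable names are arbitrary; the point is that none of the three forms is a plain "name = expr" assignment.

```python
import io
import math as m  # binds the imported module to the name "m"

# "with ... as" binds the result of __enter__(), not the expression itself.
with io.StringIO("hello") as stream:
    text = stream.read()

# "except ... as" binds the caught exception instance for the clause body.
try:
    m.sqrt(-1)
except ValueError as err:
    error_text = str(err)
```

In the `with` case the bound value happens to be the `StringIO` object itself only because its `__enter__` returns `self`; other context managers may return something else.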

On 02/15/2018 11:55 PM, Nick Coghlan wrote:
On 16 February 2018 at 12:19, rymg19@gmail.com wrote:
-1 to ?it
+1 to (expr as name)

If we're overwriting locals anyway, don't delete it. The good reason for unsetting an exception variable doesn't apply here.

Odds are good that we'll want/need that assignment even after the immediate expression it's used in. Let it stick around.

--
~Ethan~

On 17 February 2018 at 02:31, Ethan Furman <ethan@stoneleaf.us> wrote:
If we want to use a subexpression in multiple statements, then regular assignment statements already work fine - breaking out a separate variable assignment only feels like an inconvenience when we use a subexpression twice in a single statement, and then don't need it any further.

By contrast, if we have an implicit del immediately after the statement for any statement local variables, then naming a subexpression only extends its life to the end of the statement, not to the end of the current function, and it's semantically explicit that you *can't* use statement locals to name subexpressions that *aren't* statement local.

The other concern I have with any form of statement local variables that can overwrite regular locals is that we'd be reintroducing the problem that comprehensions have in Python 2.x: unexpectedly rebinding things in non-obvious ways. At least with an implicit "del" the error would be more readily apparent, and if we disallow closing over statement local variables (which would be reasonable, since closures aren't statement local either), then we can avoid interfering with regular locals without needing to introduce a new execution scope.

So let's consider a spec for statement local variable semantics that looks like this:

1. Statement local variables do *not* appear in locals()
2. Statement local variables are *not* visible in nested scopes (of any kind)
3. Statement local variables in compound statement headers are visible in the body of that compound statement
4. Due to 3, statement local variable references will be syntactically distinct from regular local variable references
5. Nested uses of the same statement local variable name will shadow outer uses, rather than overwriting them

The most discreet syntactic marker we have available is a single leading dot, which would allow the following (note that in the simple assignment cases, breaking out a preceding assignment would be easy, but the perk of the statement local spelling is that it works in *any* expression context):

    value = .s.strip()[4:].upper() if (var1 as .s) is not None else None
    value = .s[4:].upper() if (var1 as .s) is not None else None
    value = .v if (var1 as .v) is not None else .v if (var2 as .v) is not None else var3
    value = .v if not math.isnan((var1 as .v)) else .v if not math.isnan((var2 as .v)) else var3
    value = .f() if (calculate as .f) is not None else default
    filtered_values = [.v for x in keys if (get_value(x) as .v) is not None]
    range((calculate_start() as .start), .start+10)
    data[(calculate_start() as .start):.start+10]
    value if (lower_bound() as .min_val) <= value < .min_val+tolerance else 0
    print(f"{(get_value() as .v)!r} is printed in pure ASCII as {.v!a} and in Unicode as {.v}")

    if (pattern.search(data) as .m) is not None:
        # .m is available here as the match result
    else:
        # .m is also available here (but will always be None given the condition)
    # .m is no longer available here

Except clauses would be updated to allow the "except ExceptionType as .exc" spelling, which would give full statement local semantics (i.e. disallow closing over the variable, hide it from locals), rather than just deleting it at the end of the clause execution. Similarly, with statements would allow "with cm as .enter_result" to request statement local semantics for the enter result. (One potential concern here would be the not-immediately-obvious semantic difference between "with (cm as .the_cm):" and "with cm as .enter_result:").

To make that work at an implementation level we'd then need to track the following in the compiler:

* the current nested statement level in the current compilation (so we can generate distinct names at each level)
* a per-statement set of local variable names (so we can clear them at the end of the statement)
* the largest number of concurrently set statement local variables (so we can allocate space for them in the frame)
* the storage offset to use for each statement local variable

and then frames would need an additional storage area for statement locals, as well as new opcodes for accessing them.

Adding yet more complexity to an already complicated scoping model is an inherently dubious proposal, but at the same time, it does provide a way to express "and" and "or" semantics in terms of statement local variables and conditional expressions, and comparison chaining in terms of statement local variables and the "and" operator (so conceptually this kind of primitive does already exist in the language, just only as an operator-specific special case inside the interpreter).

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
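The closing remark about "and", "or" and comparison chaining can be made concrete with a sketch in current Python, where an ordinary local named `tmp` stands in for the hypothetical statement local. The helper names (`or_like`, `and_like`, `chained_less_than`) are invented for this illustration, and the right-hand operands are passed as zero-argument callables only so that short-circuiting can be shown.

```python
def or_like(left, right_thunk):
    # "left or right": the left operand is evaluated once and returned
    # if it is truthy; otherwise the right operand is evaluated.
    tmp = left
    return tmp if tmp else right_thunk()


def and_like(left, right_thunk):
    # "left and right": the left operand is returned if it is falsy;
    # otherwise the right operand is evaluated and returned.
    tmp = left
    return right_thunk() if tmp else tmp


def chained_less_than(a, middle_thunk, upper_thunk):
    # "a < b < c": the middle operand is evaluated exactly once, and the
    # upper operand is only evaluated when the first comparison holds.
    tmp = middle_thunk()
    return (a < tmp) and (tmp < upper_thunk())
```

In each case the interpreter already keeps the intermediate value around without giving it a name, which is the operator-specific special case the message refers to.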

On 02/18/2018 05:57 PM, Nick Coghlan wrote:
On 17 February 2018 at 02:31, Ethan Furman wrote:
On 02/15/2018 11:55 PM, Nick Coghlan wrote:
Good points. I see two possibly good solutions:

- don't override existing local variables, implicit del after statement
- override existing local variables, no implicit del after statement

I like the first one better, as it mirrors list comprehensions and is simple to understand. The second one is okay, and if it is significantly easier to implement I would be okay with it.

It's the combination of:

- override existing local variables, implicit del after statement

that I abhor. As I understand it, only try/except has that behavior -- and we have a really good reason for it, and it's the exception to the general rule, and...

Okay, after further thought I like the second one better. List comps have the outside brackets as a reminder that they have their own scope, but these "statement local" variables only have internal parentheses. I still don't like the third option.

--
~Ethan~
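The list comprehension behaviour that the first option is said to mirror can be checked in Python 3 with a short sketch (the function name `comprehension_scope_demo` is made up for the illustration): the loop variable inside the brackets neither overrides the existing local nor survives the expression.

```python
def comprehension_scope_demo():
    x = "outer"
    # The comprehension's x is separate from the function's local x.
    squares = [x * x for x in range(4)]
    return x, squares
```

In Python 2.x the same code would leave `x` rebound to 3 after the comprehension, which is the rebinding problem raised earlier in the thread.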

participants (4)
- Ethan Furman
- Kirill Balunov
- Nick Coghlan
- rymg19@gmail.com