A comprehension scope issue in PEP 572

In a different thread I noted that I sometimes want to write code like this:

    while any(n % p == 0 for p in small_primes):
        # divide p out - but what's p?

But generator expressions hide the value of `p` that succeeded, so I can't. `any()` and `all()` can't address this themselves - they merely work with an iterable of objects to evaluate for truthiness, and know nothing about how they're computed. If you want to identify a witness (for `any()` succeeding) or a counterexample (for `all()` failing), you need to write a workalike loop by hand.

So would this spelling work using binding expressions?

    while any(n % (thisp := p) == 0 for p in small_primes):
        n //= thisp

I'm not entirely clear from what the PEP says, but my best guess is "no", from this part of the discussion[1]:

"""
It would be convenient to use this feature to create rolling or self-effecting data streams:

    progressive_sums = [total := total + value for value in data]

This will fail with UnboundLocalError due to total not being initialized. Simply initializing it outside of the comprehension is insufficient - unless the comprehension is in class scope: ...
"""

So, in my example above, I expect that `thisp` is viewed as being local to the created-by-magic lexically nested function implementing the generator expression. `thisp` would be bound on each iteration, but would vanish when `any()` finished and the anonymous function vanished with it. I'd get a NameError on "n //= thisp" (or pick up whatever object it was bound to before the loop).

I have a long history of arguing that magically created lexically nested anonymous functions try too hard to behave exactly like explicitly typed lexically nested functions, but that's the trendy thing to do so I always lose ;-) The problem: in a magically created nested function, you have no possibility to say _anything_ about scope; at least when you type one by hand, you can add `global` and/or `nonlocal` declarations to more-or-less say what you want.
Since there's no way to explicitly identify the desired scope, I suggest that ":=" inside magically created nested functions do the more-useful-more-often thing: treat the name being bound as if the binding had been spelled in its enclosing context instead. So, in the above, if `thisp` was declared `global`, also `global` in the genexp; if `nonlocal`, also `nonlocal`; else (almost always the case in real life) local to the containing code (meaning it would be local to the containing code, but nonlocal in the generated function).

Then my example would work fine, and the PEP's would too just by adding

    total = 0

before it.

Funny: before `nonlocal` was added, one of the (many) alternative suggestions was that binding a name in an enclosing scope use ":=" instead of "=".

No, I didn't have much use for `for` target names becoming magically local to invisible nested functions either, but I appreciate that it's less surprising overall. Using ":=" is much more strongly screaming "I'm going way out of my way to give a name to this thing, so please don't fight me by assuming I need to be protected from the consequences of what I explicitly asked for".

[1] https://www.python.org/dev/peps/pep-0572/
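For reference, the hand-written "workalike loop" Tim alludes to can be spelled today without binding expressions. This is a sketch only; `small_primes` and `n` are placeholder names standing in for whatever the surrounding code defines, and the one-shot witness search can also use `next()`:

```python
# Hand-written workalike: find the witness any() would have hidden,
# and divide it out. small_primes and n are assumed placeholder names.
def divide_out_small_primes(n, small_primes):
    found = True
    while found:
        found = False
        for p in small_primes:
            if n % p == 0:      # p is the witness
                n //= p
                found = True
                break
    return n

# The one-shot witness search can also be spelled with next():
witness = next((p for p in [2, 3, 5, 7] if 45 % p == 0), None)
```

The `next()` idiom gives the witness directly (or the default when `all()`-style failure never occurs), at the cost of restating the predicate outside the `any()` call.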

On Mon, May 7, 2018 at 11:32 AM, Tim Peters <tim.peters@gmail.com> wrote:
You're correct. The genexp is approximately equivalent to:

    def genexp():
        for p in small_primes:
            thisp = p
            yield n % thisp == 0

    while any(genexp()):
        n //= thisp

With generator expressions, since they won't necessarily be iterated over immediately, I think it's correct to create an actual nested function; you need the effects of closures. With comprehensions, it's less obvious, and what you're asking for might be more plausible. The question is, how important is the parallel between

    list(x for x in iter)

and

    [x for x in iter]

? Guido likes it, and to create that parallel, list comps MUST be in their own functions too.
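The expansion above can be checked directly. A runnable version (with concrete placeholder values for the discussion's `n` and `small_primes`) confirms that `thisp` stays trapped inside the synthetic function:

```python
small_primes = [2, 3, 5, 7]   # placeholder values for the thread's names
n = 360

def genexp():
    for p in small_primes:
        thisp = p
        yield n % thisp == 0

assert any(genexp())          # a witness exists...
try:
    n //= thisp               # ...but thisp was local to genexp() and is gone
except NameError:
    result = "NameError: thisp is invisible out here"
```

So under the current PEP semantics the `while any(...): n //= thisp` spelling would indeed fail with NameError, exactly as Tim guessed.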
That's a fair point. But there is another equally valid use-case for assignment expressions inside list comps:

    values = [y + 2 for x in iter if (y := f(x)) > 0]

In this case, it's just as obvious that the name 'y' should be local to the comprehension, as 'x' is. Since there's no way to declare "nonlocal y" inside the comprehension, you're left with a small handful of options:

1) All names inside list comprehensions are common with their surrounding scope. The comprehension isn't inside a function, the iteration variable leaks, you can share names easily. Or if it *is* inside a function, all its names are implicitly "nonlocal" (in which case there's not much point having the function).

2) All names are local to their own scope. No names leak, and that includes names made with ":=".

3) Some sort of rule like "iteration variables don't leak, but those used with := are implicitly nonlocal". Would create odd edge cases eg [x for x in iter if x := x] and that would probably result in x leaking.

4) A special adornment on local names if you don't want them to leak

5) A special adornment on local names if you DO want them to leak

6) A combination of #3 and #4: "x := expr" will be nonlocal, ".x := expr" will be local, "for x in iter" will be local. Backward compatible but a pain to explain.

I can't say I'm a fan of any of the complicated ones (3 through 6). Option #2 is the current status - the name binding is part of the expression, the expression is inside an implicit function, so the name is bound within the function. Option 1 is plausible, but would be a backward compatibility break, with all the consequences thereof. It'd also be hard to implement cleanly with genexps, since they MUST be functions. (Unless they're an entirely new concept of callable block that doesn't include its own scope, which could work, but would be a boatload of new functionality.)
Is it really more useful more often?
Personally, I'd still like to go back to := creating a statement-local name, one that won't leak out of ANY statement. But the tide was against that one, so I gave up on it. ChrisA

[Chris Angelico <rosuav@gmail.com>]
I don't care how they're implemented here; I only care here about the visible semantics.
There's a difference, though: if `y` "leaks", BFD. Who cares? ;-) If `y` remains inaccessible, there's no way around that.
Since there's no way to declare "nonlocal y" inside the comprehension, you're left with a small handful of options:
I leapt straight to #3:
DOA. Breaks old code.
2) All names are local to their own scope. No names leak, and that includes names made with ":=".
Saying "local to their own scope" _assumes_ what you're trying to argue _for_ - it's circular. In fact it's impossible to know what the user intends the scope to be.
3) Some sort of rule like "iteration variables don't leak, but those used with := are implicitly nonlocal".
Explicitly, because "LHS inherits scope from its context" (whether global, nonlocal, or local) is part of what ":=" is defined to _mean_ then.
Would create odd edge cases eg [x for x in iter if x := x] and that would probably result in x leaking.
Don't care.
4) A special adornment on local names if you don't want them to leak
5) A special adornment on local names if you DO want them to leak
Probably also DOA.
Definitely DOA. ...
Is it really more useful more often?
I found no comprehensions of any kind in my code where binding expressions would actually be of use unless the name "leaked". Other code bases may, of course, yield different guesses. I'm not a "cram a lot of stuff on each line" kind of coder. But the point above remains: if they don't leak, contexts that want them to leak have no recourse. If they do leak, then the other uses would still work fine, but they'd possibly be annoyed by a leak they didn't want.
Part of that is because - as the existence of this thread attests to - we can't even control all the scopes gimmicks Python already has. So people are understandably terrified of adding even more ;-)

On Mon, May 7, 2018 at 12:34 PM, Tim Peters <tim.peters@gmail.com> wrote:
That's Steve D'Aprano's view - why not just let them ALL leak? I don't like it though.
Sorry, I meant "local to the comprehension's scope". We can't know the user's intention. We have to create semantics before the user's intention even exists.
Then let's revert the Py3 change that put comprehensions into functions, and put them back to the vanilla transformation:

    stuff = [x + 1 for x in iter if x % 3]

    stuff = []
    for x in iter:
        if x % 3:
            stuff.append(x + 1)

Now 'x' leaks as well, and it's more consistent with how people explain comprehensions. Is that a good thing? I don't think so. Having the iteration variable NOT leak means it's a self-contained unit that simply says "that thing we're iterating over".
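The contrast Chris describes is observable in any Python 3; a small sketch showing that the statement form leaks its loop variable while the comprehension keeps its own private `x`:

```python
x = "outer"
stuff = [x + 1 for x in range(3)]   # this x is private to the comprehension
assert stuff == [1, 2, 3]
assert x == "outer"                  # the outer x survives untouched

for x in range(3):                   # the statement form leaks...
    pass
assert x == 2                        # ...x is now the last loop value
```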
Part of it is just that people seem to be fighting for the sake of fighting. I'm weary of it, and I'm not going to debate this point with you. You want 'em to leak? No problem. Implement it that way and I'm not going to argue it. ChrisA

[Tim]
There's a difference, though: if `y` "leaks", BFD. Who cares? ;-) If `y` remains inaccessible, there's no way around that.
[Chris]
That's Steve D'Aprano's view - why not just let them ALL leak? I don't like it though.
I didn't suggest that. I'm not suggesting changing _any_ existing behavior (quite the contrary). Since ":=" would be brand new, there is no existing behavior for it.
Exactly. That's why I would like ":=" to be defined from the start in a way that does least damage ;-)
Then let's revert the Py3 change that put comprehensions into functions, and put them back to the vanilla transformation:
Again, I'm not suggesting changing any existing behavior.
It's fine by me that for-target names don't leak. I didn't suggest changing that.
I'm more interested in real-life use cases than in arguments. My suggestion came from staring at my real-life use cases, where binding expressions in comprehensions would clearly be more useful if the names bound leaked. Nearly (but not all) of the time, they're quite happy that for-target names don't leak. Those are matters of observation rather than of argument.

On 7 May 2018 at 13:15, Tim Peters <tim.peters@gmail.com> wrote:
The issue is that because name binding expressions are just ordinary expressions, they can't be defined as "in comprehension scope they do X, in other scopes they do Y" - they have to have consistent scoping semantics regardless of where they appear.

However, it occurs to me that a nonlocal declaration clause could be allowed in comprehension syntax, regardless of how any nested name bindings are spelt:

    p = rem = None
    while any((rem := n % p) for p in small_primes nonlocal (p, rem)):
        # p and rem were declared as nonlocal in the nested scope,
        # so our rem and p point to the last bound value
        ...

I don't really like that though, since it doesn't read as nicely as being able to put the nonlocal declaration inline.

Cheers, Nick.

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

[Nick Coghlan <ncoghlan@gmail.com>]
While I'm not generally a fan of arguments, I have to concede that's a really good argument :-) Of course their definition _could_ be context-dependent, but even I'll agree they shouldn't be. Never mind!
If the idea gets traction, I'm sure we'll see 100 other syntax ideas by the time I wake up again.

For what it's worth, I'm totally +1 on inline uses of global and nonlocal. As a related improvement, I'd also like it if "global x = 5" were a legal statement. As a noob learning Python, I was surprised to find out I couldn't, and had to split it into two lines. (Aside from a 9-hour course of C and some LabVIEW (which I totally hate), Python was my first language and still the one I'm by far most proficient with.)

2018-05-07 6:04 GMT+02:00 Tim Peters <tim.peters@gmail.com>:

On Mon, May 07, 2018 at 12:48:53PM +1000, Chris Angelico wrote:
On Mon, May 7, 2018 at 12:34 PM, Tim Peters <tim.peters@gmail.com> wrote:
I know popular opinion is against me, and backward compatibility and all that, but I wish that generator expressions and comprehensions ran in their surrounding scope, like regular for statements. (Yes, I know that makes generator expressions tricky to implement. As the guy who doesn't have to implement it, I don't have to care :-) Calling it a "leak" assumes that it is a bad thing. I don't think it is a bad thing. It's not often that I want to check the value of a comprehension loop, but when I do, I have to tear the comprehension apart into a for-loop. Even if it is only temporarily, for debugging, then put the comprehension back together. The only time I can see it is a bad thing is if I blindly copy and paste a comprehension out of one piece of code and dump it into another piece of code without checking to see that it slots in nicely without blowing away existing variables. But if you're in the habit of blindly and carelessly pasting into your code base without looking it over, this is probably the least of your worries... *wink* But what's done is done, and while there are plenty of windmills I am willing to tilt at, reversing the comprehensions scope decision is not one of them. [...]
Surely that's backwards? We ought to find out what people want before telling them that they can't have it :-)
Indeed.
Then let's revert the Py3 change that put comprehensions into functions, and put them back to the vanilla transformation:
You know we can't do that. But we do have a choice with binding expressions. *Either way*, whatever we do, we're going to upset somebody, so we simply have to decide who that will be.

Actually we have at least three choices:

(1) Consistency Über Alles (whether foolish or not)

Now that comprehensions are their own scope, be consistent about it. Binding expressions inside the comprehension will be contained to the comprehension. I'll hate it, but at least it is consistent and easy to remember: the entities which create a new scope are modules, classes, functions, plus comprehensions.

That's going to cut out at least one motivating example though. See below.

(2) Binding assignments are *defined* as "leaking", or as I prefer, defined as existing in the lexical scope that contains the comprehension. Hence:

    # module level
    [(x := a) for a in [98, 99]]
    assert x == 99

    # class level
    class X:
        [(x := a) for a in [98, 99]]
    assert X.x == 99

    # function level
    def func():
        [(x := a) for a in [98, 99]]
        assert x == 99

Note that in *all* of these cases, the variable a does not "leak".

This will helpfully support the "running total" use-case that began this whole saga:

    total = 0
    running_totals = [(total := total + x) for x in [98, 99]]
    assert total == 197

(I have to say, given that this was THE motivating use-case that began this discussion, I think it is ironic and not a little sad that the PEP has evolved in a direction that leaves this use-case unsatisfied.)

(3) A compromise: binding assignments are scoped local to the comprehension, but they are initialised from their surrounding scope. This would be similar to the way Lua works, as well as function parameter defaults. I have some vague ideas about implementation, but there's no point discussing that unless people actually are interested in this option.
This will *half* satisfy the running-total example:

    total = 0
    running_totals = [(total := total + x) for x in [98, 99]]
    assert total == 0

Guaranteed to generate at least two Stackoverflow posts a month complaining about it, but better than nothing :-)
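For what it's worth, the running-total use case can already be met today without binding expressions at all; a sketch using itertools.accumulate:

```python
from itertools import accumulate

data = [98, 99]
running_totals = list(accumulate(data))   # [98, 197]
assert running_totals == [98, 197]

# If the final total is wanted too, it's just the last element:
total = running_totals[-1] if running_totals else 0
assert total == 197
```

This doesn't settle the scoping question, of course - it just shows the motivating example has a clean spelling under any of the three options.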
Assuming there are no side-effects to any of the operations inside the comprehension.
Part of it is just that people seem to be fighting for the sake of fighting.
Them's fightin' words! *wink* Honestly Chris, I know this must be frustrating, but I'm not fighting for the sake of it, and I doubt Tim is either. I'm arguing because there are real use-cases which remain unmet if binding-variables inside comprehensions are confined to the comprehension. -- Steve

On Mon, May 7, 2018 at 9:42 PM, Steven D'Aprano <steve@pearwood.info> wrote:
Yeah, it's really easy when you don't have to worry about how on earth you can implement the concept of "unexecuted block of code that can be executed later even after the surrounding context has returned, but which isn't a function". :)
Does it HAVE to be initialised from the surrounding scope? What if the surrounding scope doesn't have that variable?

    stuff = [spam for x in items if (spam := f(x)) < 0]

Is this going to NameError if you haven't defined spam? Or is the compiler to somehow figure out whether or not to pull in a value? Never mind about implementation - what are the semantics?
Which is often the case.
I don't think you are and I don't think Tim is. But do you honestly want to say that about EVERY person in these threads? ChrisA

On Mon, May 07, 2018 at 10:38:51PM +1000, Chris Angelico wrote:
No. Then it's just an uninitialised local, and it is a NameError to try to evaluate it before something gets bound to it.
stuff = [spam for x in items if (spam := f(x)) < 0]
Is this going to NameError if you haven't defined spam?
It shouldn't be an error, because by the time the comprehension looks up the value of "spam", it has been bound by the binding-expression.
Variable "spam" is not defined in any surrounding scope, so these ought to all be NameError (or UnboundLocalError):

    [spam for a in it]  # obviously
    [(spam := spam + a) for a in it]
    [spam if True else (spam := a) for a in it]
    [spam for a in it if True or (spam := a)]

They are errors because the name "spam" is unbound when you do a lookup.

This is not an error, because the name is never looked up:

    [True or spam for a in it if True or (spam := a)]

Although "spam" never gets bound, neither does it get looked up, so no error. The next one is also okay, because "spam" gets bound before the first lookup:

    [(spam := spam+1) for a in it if (spam := a*2) > 0]

Here's a sketch of how I think locals are currently handled:

1. When a function is compiled, the compiler does a pass over the source and determines which locals are needed.

2. The compiler builds an array of slots, one for each local, and sets the initial value of the slot to "empty" (undefined).

3. When the function is called, if it tries reading from a local's slot which is still empty, it raises UnboundLocalError.

(am I close?)

Here's the change I would suggest:

2. The compiler builds an array of slots, one for each local:

2a. For locals that are the target of a binding-expression only:

    - look up the target in the current scope (that is, not the comprehension's scope, but the scope that the comprehension is inside) using the normal lookup rules, as if you were compiling "lambda x=x: None" and needed the value of x;
    - if the target is undefined, then swallow the error and leave the slot as empty;
    - otherwise store a reference to that value in the slot.

2b. For all other locals, proceed as normal.
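Steps 1-3 of that sketch match observable behaviour today; a minimal demonstration that a local's slot starts out "empty" and that reading it before binding raises UnboundLocalError:

```python
def f():
    # 'spam' is local to f for the WHOLE body, because it is assigned
    # below; reading it before that assignment finds an empty slot.
    try:
        print(spam)
    except UnboundLocalError:
        outcome = "empty slot"
    spam = 1   # this binding is what made 'spam' local in the first place
    return outcome

assert f() == "empty slot"
```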
I'm going to assume good faith, no matter the evidence :-) -- Steve

On Tue, May 8, 2018 at 12:26 AM, Steven D'Aprano <steve@pearwood.info> wrote:
Yeah, I'm pretty sure that's all correct.
It's easy when you're not implementing things. I'm going to just say "sure, go for it", and also not implement it. Have fun, whoever goes in and tries to actually do the work... I don't think there's any other construct in Python that can replicate "a thing or the absence of a thing" in that way. For instance, there's no way to copy a value from one dict into another, or delete from the target dict if it isn't in the source (other than a long-hand). I've no idea how it would be implemented, and don't feel like finding out. ChrisA

I am convinced by Tim's motivation. I hadn't thought of this use case before -- I had mostly thought "local scope in a comprehension or generator expression is the locals of the synthetic function". But Tim's reasoning feels right.

The only solution that makes sense to me is Steven's (2). (1) is what the PEP currently says and what Tim doesn't want; (3) has no precedent (function defaults don't really work like this) and just gets my hackles all up. (I can't even tell if Steven meant it as a serious proposal.)

So let's see if we can change PEP 572 so that := inside a comprehension or generator expression always assigns to a variable in the containing scope. It may be inconsistent with the scope of the loop control variable, but it's consistent with uses of := outside comprehensions:

    [x := 0]
    [x := i for i in range(1)]

both have the side effect of setting x to zero. I like that.

There's one corner case (again) -- class scopes. If the containing scope is a function, everything's fine, we can use the existing closure machinery. If the containing scope is global, everything's fine too, we can treat it as a global. But if the containing scope is a class, we can't easily extend the current machinery. But this breakage is similar to the existing breakage with comprehensions in class scope that reference class variables:

    class C:
        hosts = ['boring', 'haring', 'tering']
        full_hosts = [host + suffix for suffix in ('.cwi.nl', '.com') for host in hosts]

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<stdin>", line 3, in C
      File "<stdin>", line 3, in <listcomp>
    NameError: name 'hosts' is not defined

I propose to punt on this case. If we want to fix it we can fix it in a number of ways and the fix can easily apply to both getting and setting -- but this is a separate fix (and we should take it out of PEP 572).

PS1. The various proposals that add random extra keywords to the syntax (like 'for nonlocal i') don't appeal to me at all.

PS2. IIRC the reason we gave loop control variables their own scope was the poor imagination of many people when it comes to choosing loop control variable names. We had seen just too many examples of

    for x in something:
        ...lots of code using x...
        blah blah [x+1 for x in something_else]
        ...some more code using x, broken...

It's true that this can also happen with a for-loop statement nested inside the outer loop (and it does), but the case of a comprehension was easier to miss. I've never looked back.

PS3. Comprehensions and generator expressions should be interchangeable. They just look too similar to have different semantics (and the possibly delayed execution of generator expressions is not an issue -- it's rare, and closure semantics make it work just fine).

-- --Guido van Rossum (python.org/~guido)
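The class-scope breakage shown above has a standard workaround today: only the outermost iterable of a comprehension is evaluated eagerly in the enclosing (class) scope, so reordering the loops so that `hosts` comes first avoids the NameError (at the cost of a different result order than Guido's original):

```python
class C:
    hosts = ['boring', 'haring', 'tering']
    # The outermost iterable is evaluated in class scope, so 'hosts'
    # is visible here; the inner tuple literal needs no class lookup.
    full_hosts = [host + suffix for host in hosts
                  for suffix in ('.cwi.nl', '.com')]

assert C.full_hosts == ['boring.cwi.nl', 'boring.com',
                        'haring.cwi.nl', 'haring.com',
                        'tering.cwi.nl', 'tering.com']
```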

[Guido]
I'm trying very hard _not_ to reason. That is, I'm looking at code and trying to figure out what would actually work well, logic & consistency & preconceptions be damned. "Reasons" can be made up after the fact - which is how the world actually works regardless ;-)
He doesn't want (3) either. I can channel him on that.
So let's see if we can change PEP 572 so that := inside a comprehension or generator expression always assigns to a variable in the containing scope.
While I don't have real use cases beyond that, given that much, "consistency" kicks in to suggest that:

    def f():
        [x := 42 for x in range(1)]

makes `x` local to `f` despite that x wasn't bound elsewhere in f's body.

    def f():
        global x
        [x := 42 for x in range(1)]

binds the global `x`.

    def f():
        nonlocal x
        [x := 42 for x in range(1)]

binds `x` in the closest-containing scope in which `x` is local. The last two act as if the declaration of `x` in `f` were duplicated at the start of the synthetic function.

More succinctly, that `x := RHS` in a synthetic function "act the same" as `x = RHS` appearing in the scope directly containing the synthetic function.

Does that generalize to class scope too? I don't know. I never use fancy stuff in class scopes, and have no idea how they work anymore. So long as "def method(self, ...)" continues to work, I'm happy ;-)
And is what anyone would expect if they didn't think too much about it. ... [snipping stuff about class scope - nothing to add] ...
PS1. The various proposals that add random extra keywords to the syntax (like 'for nonlocal i') don't appeal to me at all.
They're appealing to the extent that "explicit is better than implicit" for people who actually understand how all this stuff is implemented. I don't know what percentage of Python programmers that is, though. I've certainly, e.g., seen many on Stackoverflow who can make a list comprehension work who couldn't distinguish a lexical scope from an avocado. The ":= is treated specially in comprehensions" idea is aimed more at them than at people who think invoking a synthetic anonymous lambda "is obvious".
I don't want to change anything about any of that - already believe Python struck the best balance possible.
Wholly agreed.

[Tim]
Oh, fudge - I wasn't trying to make a silly subtle point by reusing `x` as the `for` variable too. Pretend those all said "for i in range(1)" instead. Of course what happens if `x` is used in both places needs to be defined, but that's entirely beside the intended point _here_.

[Tim]
It occurs to me that, while suggestive, this is an unhelpful way to express it. It's not at all that the semantics of ":=" change inside a listcomp/genexp, it's that the latter's idea of intended _scopes_ for names is made more nuanced (inside a synthetic function created to implement a listcomp/genexp, names bound by "=" are local; names bound by ":=" are nonlocal; names bound by both are "who cares?" - a compile-time error would be fine by me, or the first person to show a real use case wins). Regardless, the runtime implementation of ":=" remains the same everywhere.

Wait, you can't use = in a listcomp, right ? Or are you talking about the implementation hidden to the casual user ? I thought letting := bind to the surrounding scope was fine basically because it's currently not possible, so therefore there would be no syntactic ambiguity, and it'd actually do what people would expect. Jacco

[Tim]
[Jacco van Dorp <j.van.dorp@deonet.nl>]
Wait, you can't use = in a listcomp, right ? Or are you talking about the implementation hidden to the casual user ?
Sorry, I was too obscure there - I intended "=" to mean "name binding by any means other than :=". Off the top of my head, I believe that - today - the only "any means other than :=" possible in a listcomp/genexp is appearing as a target in a `for` clause (like the `i` in "[i+1 for i in iterable]`). If there's some other way I missed, I meant to cover that too. But, yes, you're right, `names bound by "="` makes no literal sense at all there.
It's not really about the semantics of `:=` so much as about how synthetic functions are defined. In most cases, it amounts to saying "in the nested function synthesized for a listcomp/genexp, if a name `x` appears as the target of a binding expression in the body, a `nonlocal x` declaration is generated near the top of the synthetic function".

For example, if this block appears inside a function:

    it = (i for i in range(10))
    total = 0
    for psum in (total := total + value for value in it):
        print(psum)

under the current PEP meaning it blows up in the same way this code blows up today:

    it = (i for i in range(10))
    total = 0
    def _func(it):
        for value in it:
            total = total + value  # blows up here
            yield total
    for psum in _func(it):
        print(psum)

with

    UnboundLocalError: local variable 'total' referenced before assignment

But add

    nonlocal total

at the top of `_func()` and it works fine (displays 0, 1, 3, 6, 10, 15, ...).

So it's not really about what ":=" does, but about how ":=" affects scope in synthesized nested functions.

But if you wrote a nested function yourself? There's no suggestion here that ":=" have any effect on scope decisions in explicitly given nested functions (same as for "=", it would imply "the target is local"), just on those generated "by magic" for listcomps/genexps. Maybe there should be, though. My initial thought was "no, because the user has total control over scope decisions in explicitly given functions today, but if something was magically made nonlocal they would have no way to override that".
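The fixed version Tim describes runs as-is once the block is wrapped in a function (so that `nonlocal` has an enclosing function scope to refer to); a runnable sketch:

```python
def demo():
    it = (i for i in range(10))
    total = 0

    def _func(it):
        nonlocal total          # the declaration Tim describes adding
        for value in it:
            total = total + value
            yield total

    sums = list(_func(it))      # cumulative sums of 0..9
    return sums, total

sums, total = demo()
assert sums == [0, 1, 3, 6, 10, 15, 21, 28, 36, 45]
assert total == 45              # the nonlocal binding updated demo's total
```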

So the way I envision it is that *in the absence of a nonlocal or global declaration in the containing scope*, := inside a comprehension or genexpr causes the compiler to assign to a local in the containing scope, which is elevated to a cell (if it isn't already). If there is an explicit nonlocal or global declaration in the containing scope, that is honored.

Examples:

    # Simplest case, neither nonlocal nor global declaration
    def foo():
        [p := q for q in range(10)]
        # Creates foo-local variable p
        print(p)  # Prints 9

    # There's a nonlocal declaration
    def bar():
        p = 42  # Needed to determine its scope
        def inner():
            nonlocal p
            [p := q for q in range(10)]  # Assigns to p in bar's scope
        inner()
        print(p)  # Prints 9

    # There's a global declaration
    def baz():
        global p
        [p := q for q in range(10)]
    baz()
    print(p)  # Prints 9

All these would work the same way if you wrote list(p := q for q in range(10)) instead of the comprehension.

We should probably define what happens when you write [p := p for p in range(10)]. I propose that this overwrites the loop control variable rather than creating a second p in the containing scope -- either way it's probably a typo anyway.

:= outside a comprehension/genexpr is treated just like any other assignment (other than in-place assignment), i.e. it creates a local unless a nonlocal or global declaration exists.

-- --Guido van Rossum (python.org/~guido)
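As it turned out, Python 3.8 adopted essentially these semantics in the final PEP 572, so the first example above can be verified directly on any 3.8+ interpreter:

```python
def foo():
    # := targets the scope containing the comprehension, so p becomes
    # a local of foo even though foo never assigns it elsewhere.
    [p := q for q in range(10)]
    return p

assert foo() == 9   # p was last bound to 9 inside the comprehension
```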

So the way I envision it is that *in the absence of a nonlocal or global
This seems to be getting awfully complicated. Proof? Try to write the docs for the proposed semantics. I don't understand why we went so astray from the original requirements, which could all be met by having `if` and `while` accept `as` to bind an expression to a variable that would be local to the structured statement. Cheers, -- Juancarlo *Añez*

On Tue, May 08, 2018 at 01:28:59PM -0400, Juancarlo Añez wrote:
Okay, I'll bite. I don't know why you think it's complicated: it is precisely the same as ordinary ``=`` assignment scoping rules. It is comprehensions that are the special case.

* * *

The binding expression ``<name> := <value>`` evaluates the right hand side <value>, binds it to <name>, and then returns that value.

Unless explicitly declared nonlocal or global (in which case that declaration is honoured), <name> will belong to the current scope, the same as other assignments such as ``name = value``, with one difference. Inside comprehensions and generator expressions, variables created with ``for name in ...`` exist in a separate scope distinct from the usual local/nonlocal/global/builtin scopes, and are inaccessible from outside the comprehension. (They do not "leak".) That is not the case for those created with ``:=``, which belong to the scope containing the comprehension.

To give an example:

    a = 0
    x = [b := 10*a for a in (1, 2, 3)]
    assert x == [10, 20, 30]
    assert a == 0
    assert b == 30
That is not the original motivation for binding expressions. The original requirements were specifically for comprehensions. https://mail.python.org/pipermail/python-ideas/2018-February/048971.html This is hardly the only time that something similar has been raised. -- Steve

[Guido]
[Juancarlo Añez <apalala@gmail.com>]
This seems to be getting awfully complicated. Proof? Try to write the docs for the proposed semantics.
Implementation details - even just partial sketches - are always "busy". Think of it this way instead: it's _currently_ the case that listcomps & genexps run in a scope S that's the same as the scope C that contains them, _except_ that names appearing as `for` targets are local to S. All other names in S resolve to exactly the same scopes they resolved to in C (local in C, global in C, nonlocal in C - doesn't matter).

What changes now? Nothing in that high-level description, except that a name appearing as a binding-expression target in S that's otherwise unknown in C establishes that the name is local to C. That's nothing essentially new, though - bindings _always_ establish scopes for otherwise-unknown names in Python.
"Original" depends on when you first jumped into this ;-)

[Tim] {About binding the for loop variable}
Yeah, that binding is the one I attempted to refer to. So I did understand you after all.
My naive assumption would be both. If it's just the insertion of a nonlocal statement like Tim suggested, wouldn't the comprehension blow up to:

    def implicitfunc():
        nonlocal p
        templist = []
        for p in range(10):
            p = p
            templist.append(p)
        return templist

? If it were [q := p for p in range(10)], it would be:

    def implicitfunc():
        nonlocal q
        templist = []
        for p in range(10):
            q = p
            templist.append(q)
        return templist

Why would it need to be treated differently? (Type checkers probably should, though.)
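For the record, the `[q := p for p in range(10)]` case is the one that survived into Python 3.8 unchanged: `q` escapes into the containing function's scope while `p` does not. A quick sketch, assuming a Python 3.8+ interpreter (the function name `f` is mine):

```python
def f():
    out = [q := p for p in range(10)]
    # q was bound in f's scope by the walrus; p stayed inside the listcomp
    assert out == list(range(10))
    return q

assert f() == 9
```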
x = x is legal. Why wouldn't p := p be?
[Juancarlo] Maybe my distrust is just that I don't like the new syntax, or that I'm biased towards using "as".
I share your bias, but not your distrust. (go as !)

My apologies for something unclear in my previous mail - the second block I quoted (the one without a name) originated from Guido, not from Tim.

... [Guido]
[Jacco van Dorp <j.van.dorp@deonet.nl>]
My naive assumption would be both.
Since this is all about scope, while I'm not 100% sure of what Guido meant, I assumed he was saying "p can only have one scope in the synthetic function: local or non-local, not both, and local is what I propose". For example, let's flesh out his example a bit more:

    p = 42
    [p := p for p in range(10) if p == 3]
    print(p) # 42? 3? 9?

If `p` is local to the listcomp, it must print 42. If `p` is not-local, it must print 9. If it's some weird mixture of both, 3 makes most sense (the only time `p := p` is executed is when the `for` target `p` is 3).
If it's just the insertion of a nonlocal statement like Tim suggested,
Then all occurrences of `p` in the listcomp are not-local, and the example above prints 9.
Yes.
There's no question about that one, because `q` isn't _also_ used as a `for` target. There are two "rules" here: 1. A name appearing as a `for` target is local. That's already the case. 2. All other names (including a name appearing as a binding-expression target) are not local. Clearer? If a name appears as both, which rule applies? "Both" is likely the worst possible answer, since it's incoherent ;-) If a name appears as both a `for` target and as a binding-expression target, that particular way of phrasing "the rules" suggests #1 (it's local, period) is the more natural choice. And, whether Guido consciously knows it or not, that's why he suggested it ;-)
Why would it need to be treated differently ?
Because it's incoherent. It's impossible to make the example above print 3 _merely_ by fiddling the scope of `p`. Under the covers, two distinct variables would need to be created, both of which are named `p` as far as the user can see. For my extension of Guido's example:

    def implicitfunc():
        nonlocal p
        templist = []
        for hidden_loop_p in range(10):
            if hidden_loop_p == 3:
                p = hidden_loop_p
                templist.append(hidden_loop_p)
        return templist

[Tim]
x = x is legal. Why wouldn't p := p be ?
It's easy to make it "legal": just say `p is local, period` or `p is not local, period`. The former will confuse people who think "but names appearing as binding-expression targets are not local", and the latter will confuse people who think "but names appearing as `for` targets are local". Why bother? In the absence of an actual use case (still conspicuous by absence), I'd be happiest refusing to compile such pathological code. Else `p is local, period` is the best pointless choice ;-)

With my limited experience, I'd consider 3 to make most sense, but 9 when thinking about it in the expanded form. If it's not 3 tho, then the following would make most sense:

    SyntaxError("Cannot re-bind for target name in a list comprehension") # Or something more clear.

And the rest of that mail that convinces me even more that an error would be the correct solution here. Before I got on this mailing list, I never even knew comprehensions introduced a new scope. I'm really that new. Two years ago I'd look up stackoverflow to check the difference between overriding and extending a method and to verify whether I made my super() calls the right way. If something gets too weird, I think just throwing exceptions is a sensible solution that keeps the language simple, rather than making that much of a headache of something so trivially avoided. Jacco

[Tim]
[Jacco van Dorp <j.van.dorp@deonet.nl>]
Good news, then: Nick & Guido recently agreed that it would be a compile-time error. Assuming it's added to the language at all, of course.
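That compile-time error is indeed what shipped in Python 3.8: reusing a comprehension iteration variable as a `:=` target is rejected before the code ever runs. A minimal check:

```python
# CPython 3.8+ refuses to compile a listcomp that rebinds its own
# iteration variable with an assignment expression.
try:
    compile("[p := p for p in range(10)]", "<example>", "eval")
    outcome = "compiled"
except SyntaxError:
    outcome = "rejected at compile time"

assert outcome == "rejected at compile time"
```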
Before I got on this mailing list, I never even knew comprehensions introduced a new scope. I'm really that new.
They didn't, at first. That changed over time. The real reason was so that `for` variables - which people typically give little thought to naming - didn't accidentally overwrite local variables that happened to share the same name. Like:
But you can productively use list comprehensions without knowing anything about how they're implemented, and just think "ha! Python does some happy magic for me there :-)".
Since Guido agreed with you in this case, that proves you're a true Pythonista - or maybe just that you're both Dutch ;-)

These all match my expectations. Some glosses: [Guido]
If the genexp/listcomp is at module level, then "assign to a local in the containing scope" still makes sense ("locals" and "globals" mean the same thing at module level), but "elevated to a cell" doesn't then - it's just a plain global. In absolutely all cases, what I expect is that NAME := EXPR in a genexp/listcomp do the binding _as if_ NAME = object_EXPR_evaluates_to were executed in the immediately containing scope. Describing the goal instead of part of the implementation may be easier to grasp ;-)
100% agreed. Add at module scope:

    [p := q for q in range(10)]
    print(p) # Prints 9

But you're on your own for class scope, because I never did anything fancy enough at class scope to need to learn how it works ;-)
A compile-time error would be fine by me too. Creating two meanings for `p` is nuts - pick one in case of conflict. I suggested before that the first person with a real use case for this silliness should get the meaning their use case needs, but nobody bit, so "it's local then" is fine.
Also agreed. People have total control over scopes in explicitly given functions now, and if the compiler magically made anything nonlocal they would have no way to stop it. Well, I suppose we could add a "non_nonlocal" declaration, but I'd rather not ;-)

On 9 May 2018 at 03:57, Tim Peters <tim.peters@gmail.com> wrote:
I'd suggest that the handling of conflicting global and nonlocal declarations provides a good precedent here:
Since using a name as a binding target *and* as the iteration variable would effectively be declaring it as both local and nonlocal, or as local and global. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On 9 May 2018 at 03:06, Guido van Rossum <guido@python.org> wrote:
How would you expect this to work in cases where the generator expression isn't immediately consumed? If "p" is nonlocal (or global) by default, then that opens up the opportunity for it to be rebound between generator steps. That gets especially confusing if you have multiple generator expressions in the same scope iterating in parallel using the same binding target:

    # This is fine
    gen1 = (p for p in range(10))
    gen2 = (p for p in gen1)
    print(list(gen2))

    # This is not (given the "let's reintroduce leaking from comprehensions" proposal)
    p = 0
    gen1 = (p := q for q in range(10))
    gen2 = (p, p := q for q in gen1)
    print(list(gen2))

It also reintroduces the original problem that comprehension scopes solved, just in a slightly different form:

    # This is fine
    for x in range(10):
        for y in range(10):
            transposed_related_coords = [y, x for x, y in related_coords(x, y)]

    # This is not (given the "let's reintroduce leaking from comprehensions" proposal)
    for x in range(10):
        for y in range(10):
            related_interesting_coords = [x, y for x in related_x_coord(x, y) if is_interesting(y := f(x))]

Deliberately reintroducing stateful side effects into a nominally functional construct seems like a recipe for significant confusion, even if there are some cases where it might arguably be useful to folks that don't want to write a named function that returns multiple values instead. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
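The between-steps rebinding Nick warns about is observable under the semantics PEP 572 ultimately shipped with (Python 3.8+); whether it counts as a bug or a feature is exactly what is being debated here. A sketch:

```python
p = -1
gen1 = (p := q for q in range(10))

next(gen1)
assert p == 0   # the first step rebound p in the containing scope
next(gen1)
assert p == 1   # ...and each later step rebinds it again
```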

On Thu, May 10, 2018 at 5:17 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
That's just one of several "don't do that" situations. *What will happen* is perhaps hard to see at a glance, but it's perfectly well specified. Not all legal code does something useful though, and in this case the obvious advice should be to use different variables.
You should really read Tim's initial post in this thread, where he explains his motivation. It sounds like you're not buying it, but your example is just a case where the user is shooting themselves in the foot by reusing variable names. When writing `:=` you should always keep the scope of the variable in mind -- it's no different when using `:=` outside a comprehension. PS. Thanks for the suggestion about conflicting signals about scope; that's what we'll do. -- --Guido van Rossum (python.org/~guido)

On 10 May 2018 at 23:22, Guido van Rossum <guido@python.org> wrote:
I can use that *exact same argument* to justify the Python 2 comprehension variable leaking behaviour. We decided that was a bad idea based on ~18 years of experience with it, and there hasn't been a clear justification presented for going back on that decision presented beyond "Tim would like using it sometimes". PEP 572 was on a nice trajectory towards semantic simplification (removing sublocal scoping, restricting to name targets only, prohibiting name binding expressions in the outermost iterable of comprehensions to avoid exposing the existing scoping quirks any more than they already are), and then we suddenly had this bizarre turn into "and they're going to be implicitly nonlocal or global when used in comprehension scope".
I did, and then I talked him out of it by pointing out how confusing it would be to have the binding semantics of "x := y" be context dependent.
It *is* different, because ":=" normally binds the same as any other name binding operation including "for x in y:" (i.e. it creates a local variable), while at comprehension scope, the proposal has now become for "x := y" to create a local variable in the containing scope, while "for x in y" doesn't.

Comprehension scoping is already hard to explain when it's just a regular nested function that accepts a single argument, so I'm not looking forward to having to explain that "x := y" implies "nonlocal x" at comprehension scope (except that unlike a regular nonlocal declaration, it also implicitly makes it a local in the immediately surrounding scope).

It isn't reasonable to wave this away as "It's only confusing to Nick because he's intimately familiar with how comprehensions are implemented", as I also wrote some of the language reference docs for the current (already complicated) comprehension scoping semantics, and I can't figure out how we're going to document the proposed semantics in a way that will actually be reasonably easy for readers to follow. The best I've been able to come up with is:

- for comprehensions at function scope (including in a lambda expression inside a comprehension scope), a binding expression targets the nearest function scope, not the comprehension scope, or any intervening comprehension scope. It will appear in locals() the same way nonlocal references usually do.

- for comprehensions at module scope, a binding expression targets the global scope, not the comprehension scope, or any intervening comprehension scope. It will not appear in locals() (as with any other global reference).
- for comprehensions at class scope, the class scope is ignored for purposes of determining the target binding scope (and hence will implicitly create a new global variable when used in a top level class definition, and new function local when used in a class definition nested inside a function) Sublocal scopes were a model of simplicity by comparison :) Cheers, Nick. P.S. None of the above concerns apply to explicit inline scope declarations, as those are easy to explain by saying that the inline declarations work the same way as the scope declaration statements do, and can be applied universally to all name binding operations rather than being specific to ":= in comprehension scope". -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

Just a quickie - I'm out of time for now. [Guido]
[Nick]
Here's the practical difference: you can't write a listcomp or genexp AT ALL without a "for" clause, so whether "for" target names leak is an issue in virtually every listcomp or genexp ever written. Here's one where it isn't:

    [None for somelist[12] in range(10)]

Which nobody has ever seen in real life ;-) But ":=" is never required to write one - you only use it when you go out of your way to use it. I expect that will be relatively rare in real life listcomps and genexps.
and there hasn't been a clear justification presented for going back on that decision
Nobody is suggesting going back on "all and only `for` target names are local to the genexp/listcomp". To the contrary, the proposal preserves that verbatim: It's not _adding_ "oh, ya, and binding operator targets are local too". Just about everything here follows from _not_ adding that.
presented beyond "Tim would like using it sometimes".
So long as I'm the only one looking at real-life use cases, mine is the only evidence I care about ;-) I don't really care about contrived examples, unless they illustrate that a proposal is ill-defined, impossible to implement as intended, or likely to have malignant unintended consequences out-weighing their benefits.

On 2018-05-10 11:05, Tim Peters wrote:
You keep saying things like this with a smiley, and I realize you know what you're talking about (much more than I do), but I'd just like to push back a bit against that entire concept.

Number one, I think many people have been bringing in real life use cases.

Number two, I disagree with the idea that looking at individual use cases and ignoring logical argumentation is the way to go. The problem with it is that a lot of the thorny issues arise in unanticipated interactions between constructs that were designed to handle separate use cases.

I also do not think it's appropriate to say "if it turns out there's a weird interaction between two features, then just don't use those two things together". One of the great things about Python's design is that it doesn't just make it easy for us to write good code, but in many ways makes it difficult for us to write bad code. It is absolutely a good idea to think of the broad range of wacky things that COULD be done with a feature, not just the small range of things in the focal area of its intended use. We may indeed decide that some of the wacky cases are so unlikely that we're willing to accept them, but we can only decide that after we consider them. You seem to be suggesting that we shouldn't even bother thinking about such corner cases at all, which I think is a dangerous mistake.

Taking the approach of "this individual use case justifies this individual feature" leads to things like JavaScript, a hellhole of special cases, unintended consequences, and incoherence between different corners of the language. There are real cognitive benefits to having language features make logical and conceptual sense IN ADDITION TO having practical utility, and fit together into a unified whole.

Personally my feeling on this whole thread is that these changes, if implemented, are likely to decrease the average readability of Python code, and I don't see the benefits as being worth the added complexity.
-- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown

[Tim]
[Brendan Barnwell <brenbarn@brenbarn.net>]
I'm not so keen on meta-discussions either ;-)
Number one, I think many people have been bringing in real life use cases.
Keep in mind the context here: _this_ thread is specifically about listcomps and genexps. I agree there have been tons of use cases presented for statement-oriented applications (some positive for the feature, some negative), but not so much for listcomps and genexps. It's worth noting again that "the" use case that started all this long ago was a listcomp that the current PEP points out still "won't work":

    total = 0
    progressive_sums = [total := total + value for value in data]

It's obvious what that's intended to do. It's not obvious why it blows up. It's a question of scope, and the scopes of names in synthesized functions is a thoroughly legitimate thing to question. The suggestion made in the first message of this thread was the obvious scope change needed to make that example work, although I was motivated by looking at _other_ listcomp/genexp use cases. They wanted the same scope decision as the example above. But I didn't realize that the example above was essentially the same thing until after I made the suggestion.
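Under the scope rule Tim is advocating (which is what PEP 572 finally specified, and what Python 3.8+ implements), the motivating example works as intended once `total` is initialized in the containing scope:

```python
data = [1, 2, 3, 4]
total = 0
# total := ... binds in the scope containing the listcomp, so each
# iteration sees the running total left by the previous one.
progressive_sums = [total := total + value for value in data]

assert progressive_sums == [1, 3, 6, 10]
assert total == 10
```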
Number two, I disagree with the idea that looking at individual use cases and ignoring logical argumentation is the way to go.
Fine, then you argue, and I'll look at use cases ;-) Seriously, I don't at all ignore argument - but, yes, arguments are secondary to me. I don't give a rip about how elegant something is if it turns out to be unusable. Conversely, I don't _much_ care about how "usable" something is if the mental model for how it works is inexplicable.
Sure.
Sometimes it is, sometimes it isn't. For example, code using threads has to be aware of literal mountains of other features that may not work well (or at all) in a multi-threaded environment without major rewriting. Something as simple as "count += 1" may fail in mysterious ways otherwise. So it goes. But note that this is easily demonstrated by realistic code.
That one I disagree with. It's very easy to write bad code in every language I'm aware of. It's just that Python programmers are too enlightened to approve of doing so ;-)
It is absolutely a good idea to think of the broad range of wacky things that COULD be done with a feature,
So present some!
To the contrary, bring 'em on. But there is no feature in Python you can't make "look bad" by contriving examples, from two-page regular expressions to `if` statements nested 16 deep. "But no sane person would do that" is usually - but not always - "refutation" enough for such stuff.
I haven't ignored that here. The scope rule for synthesized functions implementing genexps and listcomps _today_ is: The names local to that function are the names appearing as `for` targets. All other names resolve to the same scopes they resolve to in the block containing the synthesized function. The scope rule if the suggestion is adopted? The same, along with that a name appearing as a ":=" target establishes that the name is local to the containing block _if_ that name is otherwise unknown in the containing block. There's nothing incoherent or illogical about that, provided that you understand how Python scoping works at all. It's not, e.g., adding any _new_ concept of "scope" - just spelling out what the intended scopes are. Of course it's worth noting that the scope decision made for ":=" targets in listcomps/genexps differs from the decision made for `for` target names. It's use cases that decide, for me, whether that's "the tail" or "the dog". Look again at the `progressive_sums` example above, and tell me whether _you'll_ be astonished if it works. Then are you astonished that
displays 1? Either way, are you astonished that
also displays 1? If you want to argue about "logical and conceptual sense", I believe you'll get lost in abstractions unless you _apply_ your theories to realistic examples.
Of course consensus will never be reached. That's why Guido is paid riches beyond the dreams of avarice ;-)

There's a lot of things in Brendan's email which I disagree with but will skip to avoid dragging this out even further. But there's one point in particular which I think is important to comment on. On Thu, May 10, 2018 at 11:23:00AM -0700, Brendan Barnwell wrote:
I don't think this concept survives even a cursory look at the language. Make it difficult to write bad code? Let's see now:

Anyone who has been caught by the "mutable default" gotcha will surely disagree:

    def func(arg, x=[]):
        ...

And the closures-are-shared gotcha:

    py> addone, addtwo, addthree = [lambda x: x + i for i in (1, 2, 3)]
    py> addone(100)
    103
    py> addtwo(100)
    103

We have no enforced encapsulation, no "private" or "protected" state for classes. Every single pure-Python class is 100% open for modification, both by subclasses and by direct monkey-patching of the class. The term "monkey-patch" was, if Wikipedia is to be believed, invented by the Python community, long before Ruby took to it as a life-style.

We have no compile-time type checks to tell us off if we use the same variable as a string, a list, an int, a float and a dict all in the one function. The compiler won't warn us if we assign to something which ought to be constant. We can reach into other modules' namespaces and mess with their variables, even replacing builtins.

Far from making it *hard* to do bad things, Python makes it *easy*. And that's how we love it! Consenting adults applies. We trust that code is not going to abuse these features, we trust that people aren't generally going to write list comps nested six levels deep, or dig deep into our module and monkey-patch our functions:

    import some_module
    some_module.function.__defaults__ = (123,) # Yes, this works.

As a community, we use these powers wisely. We don't make a habit of shooting ourselves in the foot. We don't write impenetrable forests of nested comprehensions inside lambdas or stack ternary-if expressions six deep, or write meta-metaclasses.

Binding expressions can be abused. But they have good uses too. I trust the Python community will use this for the good uses, and not change the character of the language. Just as the character of the language was not ruined by comprehensions, ternary-if or decorators. -- Steve
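Both gotchas Steven cites are easy to reproduce; a compact demonstration of each:

```python
# Gotcha 1: the mutable default argument is created once, at function
# definition time, and shared across all calls.
def func(arg, x=[]):
    x.append(arg)
    return x

assert func(1) == [1]
assert func(2) == [1, 2]   # same list as the first call!

# Gotcha 2: all three lambdas close over the *same* variable i, which
# is 3 by the time any of them is called.
addone, addtwo, addthree = [lambda x: x + i for i in (1, 2, 3)]
assert addone(100) == 103
assert addtwo(100) == 103
```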

... [Guido]
You should really read Tim's initial post in this thread, where he explains his motivation.
[Nick]
I did, and then I talked him out of it by pointing out how confusing it would be to have the binding semantics of "x := y" be context dependent.
Ya, that was an effective Jedi mind trick when I was overdue to go to sleep ;-) To a plain user, there's nothing about a listcomp or genexp that says "new function introduced here". It looks like, for all the world, that it's running _in_ the block that contains it. It's magical enough that `for` targets magically become local. But that's almost never harmful magic, and often helpful, so worth it.
":=" target names in a genexp/listcmp are treated exactly the same as any other non-for-target name: they resolve to the same scope as they resolve to in the block that contains them. The only twist is that if such a name `x` isn't otherwise known in the block, then `x` is established as being local to the block (which incidentally also covers the case when the genexp/listcomp is at module level, where "local to the block" and "global to the block" mean the same thing). Class scope may be an exception (I cheerfully never learned anything about how class scope works, because I don't write insane code ;-) ).
It doesn't, necessarily. If `x` is already known as `global` in the block, then there's an implied `global x` at comprehension scope.
(except that unlike a regular nonlocal declaration, it also implicitly makes it a local in the immediately surrounding scope).
Only if `x` is otherwise _unknown_ in the block. If, e.g., `x` is already known in an enclosing scope E, then `x` also resolves to scope E in the comprehension. It is not made local to the enclosing scope in that case. I think it's more fruitful to explain the semantics than try to explain a concrete implementation. Python has a "lumpy" scope system now, with hard breaks among global scopes, class scopes, and all other lexical scopes. That makes implementations artificially noisy to specify. "resolve to the same scope as they resolve to in the block that contains them, with a twist ..." avoids that noise (e.g., the words "global" and "nonlocal" don't even occur), and gets directly to the point: in which scope does a name live? If you think it's already clear enough which scope `y` resolves to in

    z = (x+y for x in range(10))

then it's exactly as clear which scope `y` resolves to in

    z = (x + (y := 7) for x in range(10))

with the twist that if `y` is otherwise unknown in the containing block, `y` becomes local to the block.
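Tim's "same scope, with a twist" rule can be seen in action under the semantics that eventually shipped (Python 3.8+): in the sketch below `y` is otherwise unknown in `f`, so the walrus makes it local to `f` (the function name is mine):

```python
def f():
    z = (x + (y := 7) for x in range(10))
    total = sum(z)   # consuming the genexp executes the walrus
    return y, total  # y landed in f's scope, not the genexp's

assert f() == (7, 115)   # sum of x + 7 for x in 0..9
```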
It isn't reasonable to wave this away as "It's only confusing to Nick because he's intimately familiar with how comprehensions are implemented",
As above, though, I'm gently suggesting that being so intimately familiar with implementation details may be interfering with seeing how all those details can _obscure_ rather than illuminate. Whenever you think you need to distinguish between, e.g., "nonlocal" and "global", you're too deep in the detail weeds.
Where are those docs? I expect to find such stuff in section 4 ("Execution model") of the Language Reference Manual, but listcomps and genexps are only mentioned in passing once in the 3.6.5 section 4 docs, just noting that they don't always play well at class scope.
Isn't all of that too covered by "resolve to the same scope as they resolve to in the block that contains them .."? For example, in

    class K:
        print(g)

at module level, `g` obviously refers to the global `g`. Therefore any `g` appearing as a ":=" target in an immediately contained comprehension also refers to the global `g`, exactly the same as if `g` were any other non-for-target name in the comprehension. That's not a new rule: it's a consequence of how class scopes already work. Which remain inscrutable to me ;-)
You already know I'd be happy with being explicit too, but Guido didn't like it. Perhaps he'd like it better if it were even _more_ like regular declarations. Off the top of my head, say that a comprehension could start with a new optional declaration section, like

    def f():
        g = 12
        i = 8
        genexp = (<global g; nonlocal i> g + (j := i*2) for i in range(2))

Of course that's contrived. When the genexp ran, the `g` would refer to the global `g` (and the f-local `g` would be ignored); the local-to-f `i` would end up bound to 1, and in this "all bindings are local by default" world the ":=" binding to `j` would simply vanish when the genexp ended. In practice, I'd be amazed to see anything much fancier than

    p = None # annoying but worth it ;-) that is, in this world the intended scope
             # for a nonlocal needs to be explicitly established
    while any((<nonlocal p> n % p == 0 for p in small_primes)):
        n //= p

Note too: a binding expression (":=") isn't even needed then for this class of use case. OTOH, it's inexplicable _unless_ someone learns something about how a synthetic function is being created to implement the genexp.

On 10 May 2018 at 23:47, Tim Peters <tim.peters@gmail.com> wrote:
That's all well and good, but it is *completely insufficient for the language specification*. For the language spec, we have to be able to tell implementation authors exactly how all of the "bizarre edge case" that you're attempting to hand wave away should behave by updating https://docs.python.org/dev/reference/expressions.html#displays-for-lists-se... appropriately. It isn't 1995 any more - while CPython is still the reference implementation for Python, we're far from being the only implementation, which means we have to be a lot more disciplined about how much we leave up to the implementation to define. The expected semantics for locals() are already sufficiently unclear that they're a source of software bugs (even in CPython) when attempting to run things under a debugger or line profiler (or anything else that sets a trace function). See https://www.python.org/dev/peps/pep-0558/ for details. "Comprehension scopes are already confusing, so it's OK to dial their weirdness all the way up to 11" is an *incredibly* strange argument to be attempting to make when the original better defined sublocal scoping proposal was knocked back as being overly confusing (even after it had been deliberately simplified by prohibiting nonlocal access to sublocals). 
Right now, the learning process for picking up the details of comprehension scopes goes something like this:

* make the technically-incorrect-but-mostly-reliable-in-the-absence-of-name-shadowing assumption that "[x for x in data]" is semantically equivalent to a for loop (especially common for experienced Py2 devs where this really was the case!):

    _result = []
    for x in data:
        _result.append(x)

* discover that "[x for x in data]" is actually semantically equivalent to "list(x for x in data)" (albeit without the name lookup and optimised to avoid actually creating the generator-iterator)

* make the still-technically-incorrect-but-even-more-reliable assumption that the generator expression "(x for x in data)" is equivalent to

    def _genexp():
        for x in data:
            yield x

    _result = _genexp()

* *maybe* discover that even the above expansion isn't quite accurate, and that the underlying semantic equivalent is actually this (one way to discover this by accident is to have a name error in the outermost iterable expression):

    def _genexp(_outermost_iter):
        for x in _outermost_iter:
            yield x

    _result = _genexp(_outermost_iter)

* and then realise that the optimised list comprehension form is essentially this:

    def _listcomp(_outermost_iter):
        result = []
        for x in _outermost_iter:
            result.append(x)
        return result

    _result = _listcomp(data)

Now that "yield" in comprehensions has been prohibited, you've learned all the edge cases at that point - all of the runtime behaviour of things like name references, locals(), lambda expressions that close over the iteration variable, etc can be explained directly in terms of the equivalent functions and generators, so while comprehension iteration variable hiding may *seem* magical, it's really mostly explained by the deliberate semantic equivalence between the comprehension form and the constructor+genexp form.
(That's exactly how PEP 3100 describes the change: "Have list comprehensions be syntactic sugar for passing an equivalent generator expression to list(); as a consequence the loop variable will no longer be exposed") As such, any proposal to have name bindings behave differently in comprehension and generator expression scope from the way they would behave in the equivalent nested function definitions *must be specified to an equivalent level of detail as the status quo*. All of the attempts at such a definition that have been made so far have been riddled with action at a distance and context-dependent compilation requirements:

* whether to implicitly declare the binding target as nonlocal or global depends on whether or not you're at module scope or inside a function
* the desired semantics at class scope have been left largely unclear
* the desired semantics in the case of nested comprehensions and generator expressions has been left entirely unclear

Now, there *are* ways to resolve these problems in a coherent way, and that would be to define "parent local scoping" as a new scope type, and introduce a corresponding "parentlocal NAME" compiler declaration to explicitly request those semantics for bound names (allowing the expansions of comprehensions and generator expressions as explicitly nested functions to be adjusted accordingly). But the PEP will need to state explicitly that that's what it is doing, and fully specify how those new semantics are expected to work in *all* of the existing scope types, not just the two where the desired behaviour is relatively easy to define in terms of nonlocal and global. Regards, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
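The "name error in the outermost iterable expression" route to discovering the eager-evaluation behaviour is easy to reproduce: the outermost iterable is evaluated immediately, in the enclosing scope, so the failure happens when the genexp is created rather than on the first next(). A sketch:

```python
def make_genexp():
    # no_such_name is deliberately undefined: the NameError surfaces
    # *here*, while building the genexp, because the outermost iterable
    # is evaluated eagerly in the enclosing scope.
    return (x for x in no_such_name)

try:
    make_genexp()
    failed_eagerly = False
except NameError:
    failed_eagerly = True

assert failed_eagerly
```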

On Fri, May 11, 2018 at 9:15 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
Not quite! You missed one, just because comprehensions aren't weird enough yet. AFAIK you can't tell with the list comp, but with the genexp you can (by not iterating over it).
It's actually this:

    def _genexp(_outermost_iter):
        for x in _outermost_iter:
            yield x

    _result = _genexp(iter(_outermost_iter))

I don't think there's anything in the main documentation that actually says this, although PEP 289 mentions it in the detaily bits. [1]

ChrisA

[1] https://www.python.org/dev/peps/pep-0289/#the-details
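That eager iter() call is easy to observe; a sketch (the `Tracker` class here is invented purely for illustration):

```python
class Tracker:
    """Iterable that records how many times __iter__ is called."""
    def __init__(self, n):
        self.n = n
        self.iter_calls = 0

    def __iter__(self):
        self.iter_calls += 1
        return iter(range(self.n))

t = Tracker(3)
g = (x for x in t)        # iter(t) happens *now*, at genexp creation time
assert t.iter_calls == 1

next(g)                   # no second iter() call on the first next()
assert t.iter_calls == 1

assert list(g) == [1, 2]  # consuming the rest: still only one iter() call
assert t.iter_calls == 1
```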

(Note: this is an off-topic side thread, unrelated to assignment expressions. Inline comment below.) On Fri, May 11, 2018 at 9:08 AM, Chris Angelico <rosuav@gmail.com> wrote:
I'm not sure this is the whole story. I tried to figure out how often __iter__ is called in a genexpr. I found that indeed iter() is called as soon as the generator is brought to life, but it is *not* called a second time the first time you call next(). However the translation you show has a 'for' loop which is supposed to call iter() again. So how is this done? It seems the generated bytecode isn't equivalent to a for-loop, it's equivalent to a while loop that just calls next().

Disassembly of a regular generator:

    def foo(a):
        for x in a:
            yield x

     *2        0 SETUP_LOOP              18 (to 20)*
               2 LOAD_FAST                0 (a)
              *4 GET_ITER*
         >>    6 FOR_ITER                10 (to 18)
               8 STORE_FAST               1 (x)
              10 LOAD_FAST                1 (x)
              12 YIELD_VALUE
              14 POP_TOP
              16 JUMP_ABSOLUTE            6
         >>   18 POP_BLOCK
         >>   20 LOAD_CONST               0 (None)
              22 RETURN_VALUE

But for a generator expression:

    g = (x for x in C())

      1        0 LOAD_FAST                0 (.0)
         >>    2 FOR_ITER                10 (to 14)
               4 STORE_FAST               1 (x)
               6 LOAD_FAST                1 (x)
               8 YIELD_VALUE
              10 POP_TOP
              12 JUMP_ABSOLUTE            2
         >>   14 LOAD_CONST               0 (None)
              16 RETURN_VALUE

Note the lack of SETUP_LOOP and GET_ITER (but otherwise they are identical).

--
--Guido van Rossum (python.org/~guido)

[Tim]
[Nick]
That's all well and good, but it is *completely insufficient for the language specification*.
I haven't been trying to write reference docs here, but so far as supplying a rigorous specification goes, I maintain the above gets "pretty close". It needs more words, and certainly isn't in the _style_ of Python's current reference docs, but that's all repairable. Don't dismiss it just because it's brief. Comprehensions already exist in the language, and so do nested scopes, so it's not necessary for this PEP to repeat any of the stuff that goes into those. Mostly it needs to specify the scopes of assignment expression target names - and the _intent_ here is really quite simple.

Here with more words, restricted to the case of assignment expressions in comprehensions (the only case with any subtleties):

Consider a name `y` appearing in the top level of a comprehension as an assignment expression target, where the comprehension is immediately contained in scope C, and the names belonging to scopes containing C have already been determined:

    ... (y := expression) ...

We can ignore that `y` also appears as a `for` target at the comprehension's top level, because it was already decided that's a compile-time error.

Consider what the scope of `y` would be if `(y := expression)` were textually replaced by `(y)`. Then what would the scope of `y` be? The answer relies solely on what the docs _already_ specify. There are three possible answers:

1. The docs say `y` belongs to scope S (which may be C itself, or a scope containing C). Then y's scope in the original comprehension is S.

2. The docs say name `y` is unknown. Then y's scope in the original comprehension is C.

3. The docs are unclear about whether #1 or #2 applies. Then the language is _already_ ill-defined.

It doesn't matter to this whether the assignment expression is, or is not, in the expression that defines the iterable for the outermost `for`.

What about that is hand-wavy?
Defining semantics clearly and unambiguously doesn't require specifying a concrete implementation (the latter is one possible way to achieve the goal - but _here_ it's a convoluted PITA because Python has no way to explicitly declare intended scopes). Since all questions about scope are reduced by the above to questions about Python's _current_ scope rules, it's as clear and unambiguous as Python's current scope rules.

Now those may not be the _intended_ rules in all cases. That deserves deep scrutiny. But claiming it's too vague to scrutinize doesn't fly with me. If there's a scope question you suspect can't be answered by the above, or that the above gives an unintended answer to, by all means bring that up! If your question isn't about scope, then I'd probably view it as being irrelevant to the current PEP (e.g., what `locals()` returns depends on how the relevant code object attributes are set, which are in turn determined by which scopes names belong to relative to the code block's local scope, and it's certainly not _this_ PEP's job to redefine what `locals()` does with that info).

Something to note: for-target names appearing in the outermost `for` _may_ have different scopes in different parts of the comprehension.

    y = 12
    [y for y in range(y)]

There the first two `y`'s have scope local to the comprehension, but the last `y` is local to the containing block. But an assignment expression target name always has the same scope within a comprehension. In that specific sense, their scope rules are "more elegant" than for-target names. This isn't a new rule, but a logical consequence of the scope-determining algorithm given above. It's a _conceptual_ consequence of the fact that assignment expression targets are "intended to act like" the bindings are performed _in_ scope C rather than in the comprehension's scope.
And that's no conceptually weirder than that it's _already_ the case that the expression defining the iterable of the outermost `for` _is_ evaluated in scope C (which I'm not a fan of, but which is rhetorically convenient to mention here ;-) ). As I've said more than once already, I don't know whether this should apply to comprehensions at class scope too - I've never used a comprehension in class scope, and doubt I ever will. Without use cases I'm familiar with, I have no idea what might be most useful there. Best uninformed guess is that the above makes decent sense at class scope too, especially given that I've picked up on that people are already baffled by some comprehension behavior at class scope. I suspect that you already know, but find it rhetorically convenient to pretend this is all so incredibly unclear you can't possibly guess ;-)
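The mixed-scope `y = 12` example above is runnable as-is today, and behaves exactly as described (a quick check):

```python
# The last y (in range(y)) resolves to the containing function's y;
# the other two y's are local to the comprehension's implicit scope.
def demo():
    y = 12
    result = [y for y in range(y)]
    assert result == list(range(12))
    assert y == 12   # the containing scope's y is untouched
    return "ok"

assert demo() == "ok"
```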
For the language spec, we have to be able to tell implementation authors exactly how all of the "bizarre edge case"
Which are?
that you're attempting to hand wave away
Not attempting to wave them away - don't know what you're referring to. The proposed scope rules are defined entirely by straightforward reference to existing scope rules - and stripped of all the excess verbiage amount to no more than "same scope in the comprehension as in the containing scope".
should behave by updating https://docs.python.org/dev/reference/expressions.html#displays-for-lists-se...
Thanks for the link! I hadn't seen that before. If the PEP gets that far, I'd think harder about how it really "ought to be" documented. I think, e.g., that scope issues should be more rigorously handled in section 4.2 (which is about binding and name resolution).
What in the "more words" above was left to the implementation's discretion? I can already guess you don't _like_ the way it's worded, but that's not what I'm asking about.
As above, what does that have to do with PEP 572? The docs you referenced as a model don't even mention `locals()` - but PEP 572 must?

Well, fine: from the explanation above, it's trivially deduced that all names appearing as assignment expression targets in comprehensions will appear as free variables in their code blocks, except for when they resolve to the global scope. In the former case, looks like `locals()` will return them, despite that they're _not_ local to the block. But that's the same thing `locals()` does for free variables created via any means whatsoever - it appears to add all the names in code_object.co_freevars to the returned dict. I have no idea why it acts that way, and wouldn't have done it that way myself. But if that's "a bug", it would be repaired for the PEP 572 cases at the same time and in the same way as for all other freevars cases.

Again, the only thing at issue here is specifying intended scopes. There's nothing inherently unique about that.
That's an extreme characterization of what, in reality, is merely specifying scopes. That

    total = 0
    sums = [total := total + value for value in data]

blows up without the change is at least as confusing - and is more confusing to me.
I'm done arguing about this part ;-)
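For what it's worth, under the parent-local semantics being argued for here (the behaviour Python 3.8 ultimately adopted), the `total` example runs exactly as the visual reading suggests:

```python
# Requires Python 3.8+: the := target binds in the scope containing
# the comprehension, not in the comprehension's implicit function.
total = 0
data = [1, 2, 3, 4]
sums = [total := total + value for value in data]
assert sums == [1, 3, 6, 10]   # running totals
assert total == 10             # the binding is visible here, by design
```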
Right now, the learning process for picking up the details of comprehension scopes goes something like this:
Who needs to do this? I'm not denying that many people do, but is that a significant percentage of those who merely want to _use_ comprehensions? We already did lots of heroic stuff apparently attempting to cater to those who _don't_ want to learn about their implementation, like evaluating the outer iterable "at once" outside the comprehension scope, and - indeed - bothering to create a new scope for them at all. Look at the "total := total + value" example again and really try to pretend you don't know anything about the implementation. "It works!" is a happy experience :-) For the rest of this message, it's an entertaining and educational development. I'm not clear on what it has to do with the PEP, though.
I don't see any of those Python workalike examples in the docs. So which "status quo" are you referring to?

You already know it's possible, and indeed straightforward, to write functions that model the proposed scope rules in any given case, so what's your real point? They're "just like" the stuff above, possibly adding a sprinkling of "nonlocal" and/or "global" declarations. They don't require changing anything fundamental about the workalike examples you've already given - just adding cruft to specify scopes. I don't want to bother doing it here, because it's just tedious, and you _already know_ it.

Most tediously, because there's no explicit way to declare a non-global scope in Python, in the

"""
2. The docs say name `y` is unknown. Then y's scope in the original comprehension is C.
"""

case it's necessary to do something like:

    if 0:
        y = None

in the scope containing the synthetic function so that the contained "nonlocal y" declaration knows which scope `y` is intended to live in. (The "if 0:" block is optimized out of existence, but after the compiler has noticed the local assignment to `y` and so records that `y` is containing-scope-local.) Crap like that isn't really illuminating.
That's artificial silliness, though. Already suggested that Python repair one of its historical scope distinctions by teaching `nonlocal` that

    nonlocal x

in a top-level function is a synonym for

    global x

in a top-level function. In every relevant conceptual sense, the module scope _is_ the top-level lexical scope. It seems pointlessly pedantic to me to insist that `nonlocal` _only_ refer to a non-global enclosing lexical scope. Who cares? The user-level semantically important part is "containing scope", not "is implemented by a cell object".

In the meantime, BFD. So long as the language keyword insists on making that distinction, ya, it's a distinction that needs to be made by users too (and by the compiler regardless). This isn't some inherently new burden for the compiler either. When it sees a not-local name in a function, it already has to figure out whether to reference a cell or pump out a LOAD_GLOBAL opcode.
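The distinction being complained about here is easy to demonstrate (a sketch; the exception message text is CPython's current wording):

```python
# `nonlocal` refuses to resolve to the module (global) scope: a
# top-level function declaring a name nonlocal fails to compile.
src = """
def f():
    nonlocal x
    x = 1
"""
msg = ""
try:
    compile(src, "<example>", "exec")
except SyntaxError as e:
    msg = str(e)
assert "no binding for nonlocal" in msg

# The spelling Python insists on for the same intent at top level:
ns = {}
exec("x = 0\ndef g():\n    global x\n    x = 1\ng()", ns)
assert ns["x"] == 1
```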
* the desired semantics at class scope have been left largely unclear
Covered before. Someone who knows something about _desired_ class scope behavior needs to look at that. That's not me.
* the desired semantics in the case of nested comprehensions and generator expressions has been left entirely unclear
See the "more words" version above. It implies that scopes need to be resolved "outside in" for nesting of any kind. Which they need to be anyway, e.g., to make the "is this not-local name a cell or a global?" distinction in any kind of function code.
Sorry, I don't know what that means. I don't even know what "compiler declaration" alone means. Regardless, there's nothing here that can't be explained easily enough by utterly vanilla lexically nested scopes. All the apparent difficulties stem from the inability to explicitly declare a name's intended scope, and from the fact that the "nonlocal" keyword in a top-level function currently refuses to acknowledge that the global scope _is_ the containing not-local scope.

If you mean adding a new statement to Python

    parentlocal NAME ...

sure, that could work. But it obscures that the problem just isn't hard enough to require such excessive novelty in Python's scope gimmicks. The correct place to declare NAME's scope is _in_ NAME's intended scope, the same as in every other language with lexical scoping.

There's also the fact that the plain English meaning of "parent local" only applies to rule #2 at the top, and to the proper subset of cases in rule #1 where it turns out that S is C. In the other rule #1 cases, "parentlocal" would be a misleading name for the less specific "nonlocal" or the more specific "global".

Writing workalike functions by hand isn't difficult regardless, just tedious (even without the current proposal!), and I don't view it as a significant use case regardless. I expect the minority who do it have real fun with it for a day or two, and then quite possibly never again. Which is a fair summary of my own life ;-)
So you finally admit they _are_ relatively easy to define ;-)

What, specifically, are *all* of the existing scope types? There are only module, class, and function scopes in my view of the world (and "comprehension scope" is just a name given at obvious times to function scope in my view of the world).

If you also want piles of words about, e.g., how PEP 572 acts in all cases in smaller blocks, like code typed at a shell, or strings passed to eval() or exec(), you'll first have to explain why this was never necessary for any previous feature.

PS: I hope you appreciate that I didn't whine about microscopic differences in the workalike examples' generated byte code ;-)

[Tim]
Something related to ponder: what's the meaning of the following _without_ the proposed scope change?

So the Golden Binding Rule (GBR) applies then:

    GBR: binding a name by any means always makes the name local to the
    block the binding appears in, unless the name is declared "global"
    or "nonlocal" in the block.

    def f():
        ys = [y for _ in range(y := 5)]

The second instance of `y` is local - but local to what? Since the range is evaluated _in_ f's scope, presumably that instance of `y` is local to `f`. What about the first instance of `y`? Is that _not_ local to the comprehension despite that the GBR insists it must be local to the comprehension? Or does it raise UnboundLocalError for consistency with the GBR, and "well, so just don't use any name in a comprehension that appears as an assignment expression target in the expression defining the iterable for the outermost `for`"?

Or is it that despite that `range(y := 5)` is executed in f's scope, the _binding_ is actually performed in the comprehension's scope to a comprehension-local `y`, to both preserve GBR and avoid the UnboundLocalError? But then what if `print(y)` is added after? If `range(y := 5)` really was executed in f's scope, surely that must print 5.

Then what about

    [y for y in range(y := 5)]

? Now that there's another binding inside the comprehension establishing that `y` is local to the comprehension "for real", does that work fine and the rule changes to

    well, so just don't use any name in a comprehension that appears as
    an assignment expression target in the expression E defining the
    iterable for the outermost `for` - unless the name is _also_ used in
    a binding context in the comprehension outside of E too

? Or is that a compile-time error despite that the first 2 y's are now obviously comprehension-local and the final y obviously f-local?

Or are assignment expressions disallowed in the expression defining the iterable for the outermost `for`, and both examples are compile-time errors?
Talk about incoherent ;-) Under the proposed change, all instances of `y` are local to `f` in the first example, and the second example is a compile-time error for a _coherent_ reason (the ":=" binding implies "not local" for `y` - which has nothing to do with that it's in the outermost `for` -, but the "for y in" binding implies "local" for `y`).

Just showing an example of "by hand" code emulating nesting of comprehensions, with a highly dubious rebinding, in the inner comprehension, of an outer comprehension's local for-target.

    list(i + sum((i := i+1) + i for j in range(i)) for i in range(5))

I don't believe I have compelling use cases for nesting listcomps/genexps, so that's just made up to be an example of atrocious feature abuse :-)

In the outer genexp, `i` is obviously local, as is `j` in the inner genexp. But the assignment expression in the inner genexp demands that `i` _there_ be not-local. To which scope does the inner `i` belong? To the same scope it would belong if `i := i+1` were replaced by `i`, which the docs today say is the outer genexp's scope. So that's what it is.

Here's code to emulate all that, with a bit more to demonstrate that `i` and `j` in the scope containing that statement remain unchanged. The only "novelty" is that a `nonlocal` declaration is needed to establish an intended scope.

    def f():
        i = 42
        j = 53

        def outer(it):
            def inner(it):
                nonlocal i
                for j in it:
                    i = i+1
                    yield i

            for i in it:
                yield i + sum(inner(range(i))) + i

        print(list(outer(range(5))))
        print(i, j)

    f()

The output:

    [0, 5, 13, 24, 38]
    42 53

Since the code is senseless, so is the list it generates ;-) Showing it this way may make it clearer:

    [0+(0)+0, 1+(2)+2, 2+(3+4)+4, 3+(4+5+6)+6, 4+(5+6+7+8)+8]

Ah, fudge - I pasted in the wrong "high-level" code. Sorry! The code that's actually being emulated is not
list(i + sum((i := i+1) + i for j in range(i)) for i in range(5))
but list(i + sum((i := i+1) for j in range(i)) + i for i in range(5))
...
I have piles of these, but they're all equally tedious so I'll stop with this one ;-)

[Nick]
That's all well and good, but it is *completely insufficient for the language specification*.
And if you didn't like those words, you're _really_ gonna hate this ;-)

I don't believe more than just the following is actually necessary, although much more than this would be helpful. I spent the first 15-plus years of my career writing compilers for a living, so am sadly resigned to the "say the minimum necessary for a long argument to conclude that it really was the minimum necessary" style of language specs. That's why I exult in giving explanations and examples that might actually be illuminating - it's not because I _can't_ be cryptically terse ;-)

Section 4.2.1 (Binding of names) of the Language Reference Manual has a paragraph starting with "The following constructs bind names:". It really only needs another two-sentence paragraph after that to capture all of the PEP's intended scope semantics (including my suggestion):

"""
An assignment expression binds the target, except in a function F synthesized to implement a list comprehension or generator expression (see XXX). In the latter case, if the target is not in F's environment (see section 4.2.2), the target is bound in the block containing F.
"""

That explicitly restates my earlier "rule #2" in the language already used by the manual. My "rule #1" essentially vanishes as such, because it's subsumed by what the manual already means by "F's environment".

This may also be the best place to add another new sentence:

"""
Regardless, if the target also appears as an identifier target of a `for` loop header in F, a `SyntaxError` exception is raised.
"""

Earlier, for now-necessary disambiguation, I expect that in

    ... targets that are identifiers if occurring in an assignment, ...

" statement" should be inserted before the comma.

[Tim, suggests changes to the Reference Manual's 4.2.1]
Let me try that again ;-) The notion of "environment" includes the global scope, but that's not really wanted here. "Environment" has more of a runtime flavor anyway.

And since nobody will tell me anything about class scope, I read the docs myself ;-) And that's a problem, I think! If a comprehension C is in class scope S, apparently the class locals are _not_ in C's environment. Since C doesn't even have read access to S's locals, it seems to me bizarre that ":=" could _create_ a local in S.

Since I personally couldn't care less about running comprehensions of any kind at class scope, I propose to make `:=` a SyntaxError if someone tries to use a comprehension with ':=' at class scope (of course they may be able to use ":=" in nested comprehensions anyway - not that anyone would). If someone objects to that, fine, you figure it out ;-)

So here's another stab.

"""
An assignment expression binds the target, except in a function F synthesized to implement a list comprehension or generator expression (see XXX). In the latter case[1]:

- If the target also appears as an identifier target of a `for` loop header in F, a `SyntaxError` exception is raised.

- If the block containing F is a class block, a `SyntaxError` exception is raised.

- If the target is not local to any function enclosing F, and is not declared `global` in the block containing F, then the target is bound in the block containing F.

Footnote:

[1] The intent is that runtime binding of the target occurs as if the binding were performed in the block containing F. Because that necessarily makes the target not local in F, it's an error if the target also appears in a `for` loop header, which is a local binding for the same target. If the containing block is a class block, F has no access to that block's scope, so it doesn't make sense to consider the containing block. If the target is already known to the containing block, the target inherits its scope resolution from the containing block. Else the target is established as local to the containing block.
"""

I realize the docs don't generally use bullet lists. Convert to WallOfText if you must. The material in the footnote would usually go in a "Rationale" doc instead, but we don't have one of those, and I think the intent is too hard to deduce without that info.

And repeating the other point, to keep a self-contained account:

On 05/12/2018 11:41 PM, Tim Peters wrote:
    Python 3.7.0b3+ (heads/bpo-33217-dirty:28c1790, Apr  5 2018, 13:10:10)
    [GCC 4.8.2] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    --> class C:
    ...     huh = 7
    ...     hah = [i for i in range(huh)]
    ...
    --> C.hah
    [0, 1, 2, 3, 4, 5, 6]

Same results clear back to 3.3 (the oldest version of 3 I have). Are the docs wrong? Or maybe they just refer to functions:

    --> class C:
    ...     huh = 7
    ...     hah = [i for i in range(huh)]
    ...     heh = lambda: [i for i in range(huh)]
    ...
    --> C.hah
    [0, 1, 2, 3, 4, 5, 6]
    --> C.heh()
    Traceback (most recent call last):
      File "test_class_comp.py", line 7, in <module>
        print(C.heh())
      File "test_class_comp.py", line 4, in <lambda>
        heh = lambda: [i for i in range(huh)]
    NameError: global name 'huh' is not defined

So a class-scope comprehension assignment expression should behave as you originally specified.

--
~Ethan~

[Tim]
[Ethan Furman <ethan@stoneleaf.us>]
As Chris already explained (thanks!), the expression defining the iterable for the outermost `for` (which, perhaps confusingly, is the _leftmost_ `for`) is treated specially in a comprehension (or genexp), evaluated at once _in_ the scope containing the comprehension, not in the comprehension's own scope. Everything else in the comprehension is evaluated in the comprehension's scope.

I just want to add that it's really the same thing as your lambda example. Comprehensions are also implemented as lambdas (functions), but invisible functions created by magic. The synthesized function takes one argument, which is the expression defining the iterable for the outermost `for`. So, skipping irrelevant-to-the-point details, your original example is more like:

    class C:
        huh = 7

        def _magic(it):
            return [i for i in it]

        hah = _magic(range(huh))

Since the `range(huh)` part is evaluated _in_ C's scope, no problem.

For a case that blows up, as Chris did you can add another `for` as "outermost", or just try to reference a class local in the body of the comprehension:

    class C2:
        huh = 7
        hah = [huh for i in range(5)]

That blows up (NameError on `huh`) for the same reason your lambda example blows up, because it's implemented like:

    class C:
        huh = 7

        def _magic(it):
            return [huh for i in it]

        hah = _magic(range(5))

and C's locals are not in the environment seen by any function called from C's scope.

A primary intent of the proposed ":= in comprehensions" change is that you _don't_ have to learn this much about implementation cruft to guess what a comprehension will do when it contains an assignment expression. The intent of

    total = 0
    sums = [total := total + value for value in data]

is obvious - until you think too much about it ;-) Because there's no function in sight, there's no reason to guess that the `total` in `total = 0` has nothing to do with the instances of `total` inside the comprehension.
The point of the change is to make them all refer to the same thing, as they already do in (the syntactically similar, but useless):

    total = 0
    sums = [total == total + value for value in data]

Except even _that_ doesn't work "as visually expected" in class scope today. The `total` inside the comprehension refers to the closest (if any) scope (_containing_ the `class` statement) in which `total` is local (usually the module scope, but may be a function scope if the `class` is inside nested functions).

In function and module scopes, the second `total` example does work in "the obvious" way, so in those scopes I'd like to see the first `total` example do so too.
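The class-scope behaviour described above is easy to reproduce (a runnable sketch of the `huh` examples):

```python
# The outermost iterable is evaluated in class scope, so `huh` works
# there; the comprehension body runs in its own function scope, which
# cannot see class locals.
class C:
    huh = 7
    hah = [i for i in range(huh)]      # fine: range(huh) runs in C's scope

assert C.hah == [0, 1, 2, 3, 4, 5, 6]

blew_up = False
try:
    class C2:
        huh = 7
        hah = [huh for i in range(5)]  # body can't see the class-local huh
except NameError:
    blew_up = True
assert blew_up
```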

[Tim]
FYI, that's still not right, but I've been distracted by trying to convince myself that the manual actually defines what happens when absurdly deeply nested functions mix local values for a name at some levels with a `global` declaration of the name at other levels.

I suspect that the above should be reworded to the simpler:

- If the target is not declared `global` or `nonlocal` in the block containing F, then the target is bound in the block containing F.

That makes "intuitive sense" because if the target is declared `global` or `nonlocal` the meaning of binding in the block is already defined to affect a not-local scope, while if it's not declared at all then binding in the block "should" establish that it's local to the block (regardless of how containing scopes treat the same name). But whether that all follows from what the manual already says requires more staring at it ;-)

Regardless, if anyone were to point it out, I'd agree that it _should_ count against this that establishing which names are local to a block may require searching top-level comprehensions in the block for assignment expressions. On a scale of minus a million to plus a million, I'd only weight that in the negative thousands, though ;-)

[Nick Coghlan <ncoghlan@gmail.com> ]
I'm most interested in what sensible programmers can do easily that's of use, not really about pathologies that can be contrived.
Sure.
# This is not (given the "let's reintroduce leaking from comprehensions" proposal)
Be fair: it's not _re_introducing anything. It's brand new syntax for which "it's a very much intended feature" that a not-local name can be bound. You have to go out of your way to use it. Where it doesn't do what you want, don't use it.
p = 0
I'm not sure of the intent of that line. If `p` is otherwise unknown in this block, its appearance as a binding operator target in an immediately contained genexp establishes that `p` is local to this block. So `p = 0` here just establishes that directly. Best I can guess, the 0 value is never used below.
gen1 = (p := q for q in range(10))
I expect that's a compile time error, grouping as

    gen1 = (p := (q for q in range(10)))

but without those explicit parentheses delimiting the "genexp part" it may not be _recognized_ as being a genexp. With the extra parens, it binds both `gen1` and `p` to the genexp, and `p` doesn't appear in the body of the genexp at all. Or did you intend

    gen1 = ((p := q) for q in range(10))

? I'll assume that's so.
gen2 = (p, p := q for q in gen1)
OK, I really have no guess about the intent there. Note that

    gen2 = (p, q for q in gen1)

is a syntax error today, while

    gen2 = (p, (q for q in gen1))

builds a 2-tuple. Perhaps

    gen2 = ((p, p := q) for q in gen1)

was intended? Summarizing:

    gen1 = ((p := q) for q in range(10))
    gen2 = ((p, p := q) for q in gen1)

is my best guess.
print(list(gen2))
    [(0, 0), (1, 1), (2, 2), ..., (9, 9)]

But let's not pretend it's impossible to do that today; e.g., this code produces the same:

    class Cell:
        def __init__(self, value=None):
            self.bind(value)

        def bind(self, value):
            self.value = value
            return value

    p = Cell()
    gen1 = (p.bind(q) for q in range(10))
    gen2 = ((p.value, p.bind(q)) for q in gen1)
    print(list(gen2))

Someone using ":=" INTENDS to bind the name, just as much as someone deliberately using that `Cell` class.
I'm not clear on what "This is fine" means, other than that the code does whatever it does. That's part of why I so strongly prefer real-life use cases. In the code above, I can't imagine what the intent of the code might be _unless_ they're running tons of otherwise-useless code for _side effects_ performed by calling `related_coords()`. If "it's functional", they could do the same via

    x = y = 9
    transposed_related_coords = [y, x for x, y in related_coords(x, y)]

except that's a syntax error ;-) I assume

    transposed_related_coords = [(y, x) for x, y in related_coords(x, y)]

was intended.

BTW, I'd shoot anyone who tried to check in that code today ;-) It inherently relies on that the name `x` inside the listcomp refers to two entirely different scopes, and that's Poor Practice (the `x` in the `related_coords()` call refers to the `x` in `for x in range(10)`, but all other instances of `x` refer to the listcomp-local `x`).
Same syntax error there (you need parens around "x, y" at the start of the listcomp). Presumably they _intended_ to build (x, f(x)) pairs when and only when `f(x)` "is interesting". In what specific way does the code fail to do that? Yes, the outer `y` is rebound, but what of it? When the statement completes, `y` will be rebound to the next value from the inner range(10), and that's the value of `y` seen by `related_x_coord(x, y)` the next time the loop body runs. The binding done by `:=` is irrelevant to that. So I don't see your point in that specific example, although - sure! - of course it's possible to contrive examples where it really would matter. For example, change the above in some way to use `x` as the binding operator target inside the listcomp. Then that _could_ affect the value of `x` seen by `related_x_coord(x, y)` across inner loop iterations.
Deliberately reintroducing stateful side effects into a nominally functional construct seems like a recipe for significant confusion,
Side effects of any kind anywhere can create significant confusion. But Python is not a functional language, and if you don't want side effects due to ":=" in synthetic functions, you're not required to use ":=" in that context. That said, I agree "it would be nice" if advanced users had a way to explicitly say which scope they want.
even if there are some cases where it might arguably be useful to folks that don't want to write a named function that returns multiple values instead.
Sorry, I didn't follow that - functions returning multiple values?

On 5/7/2018 1:38 PM, Guido van Rossum wrote:
If I am understanding correctly, this would also let one *intentionally* 'leak' (export) the last value of the loop variable when wanted.

    [math.log(xlast := x) for x in it if x > 0]
    print(xlast)
This is a special case of the fact that no function called in class scope can access class variables, even if defined in the class scope.
    Traceback (most recent call last):
      File "<pyshell#5>", line 1, in <module>
        class C:
      File "<pyshell#5>", line 5, in C
        z = f()
      File "<pyshell#5>", line 4, in f
        return x
    NameError: name 'x' is not defined

I would find it strange if only functions defined by a comprehension were given new class scope access.
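The class-scope behaviour Terry describes can be checked directly in any Python 3. A minimal sketch (the names `C`, `f`, and `z` here are illustrative, not taken from the quoted session):

```python
# A function defined (and even called) in class scope cannot see
# class variables through normal lexical scoping: class bodies are
# skipped when resolving names inside nested functions.
class C:
    x = 10          # a class variable

    def f():
        return x    # looks up a global/builtin x, NOT C.x

    try:
        z = f()     # calling f() while the class body executes
    except NameError:
        z = "NameError"

# With no global x defined, the lookup inside f() fails,
# so C.z ends up as "NameError".
```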
To me, this is the prime justification for the 3.0 comprehension change. I currently see a comprehension as a specialized generator expression. A generator expression generalizes math set builder notation. If left 'raw', the implied function yields the values generated (what else could it do?). If a collection type is indicated by the fences and expression form, values are instead added to an anonymous instance thereof. -- Terry Jan Reedy

On Mon, May 07, 2018 at 10:38:09AM -0700, Guido van Rossum wrote:
It doesn't get my hackles up as much as it does yours, but it's not really what I want. It's just a compromise between what I *don't* want (1), which fails to solve the original motivating example that started this discussion, and what Chris was pushing back against (2).
+1 Whether the current class behaviour is "broken" or desirable or somewhere in between, it is what we have now and it's okay if binding expressions have the same behaviour. -- Steve

Yes. I have some probably tangential-to-bad arguments, but I'm going to make them anyway, because I think := makes the most sense along with SLNB. First, := vs post-hoc (e.g. where or given). Base case:

    [ x for x in range(1) ]

While obvious to all of us, reading left to right does not yield what x is till later.

    [ (x, y) for x in range(1) for y in range(1) ]

Doubly so. If x or y were defined above, it would not be clear until the right end what context they had.

    [ (x, y) for x in range(n) given y = f(n) ]

I don't know what the iterator is till after 'for'.

    [ (x, y := f(n)) for x in range(n) ]

At a minimum, I learn immediately that y is not the iterator. Slightly less cognitive load. It's not that one is better, or that either is unfamiliar; it's about having to hold a "promise" in my working memory, vs getting an immediate assignment earlier. (It's a metric!) Now my silly argument: ":" is like a "when" operator, as in "if y == x:".

On May 6, 2018 8:41:26 PM Tim Peters <tim.peters@gmail.com> wrote:
Couldn't you just do: def first(it): return next(it, None) while (item := first(p for p in small_primes if n % p == 0)): # ... IMO for pretty much anything more complex, it should probably be a loop in its own function.
-- Ryan (ライアン) Yoko Shimomura, ryo (supercell/EGOIST), Hiroyuki Sawano >> everyone else https://refi64.com/

[Tim]
[Ryan Gonzalez <rymg19@gmail.com>]
In the "different thread" I mentioned above, I already noted that kind of spelling. I'm not at a loss to think of many ways to spell it ;-) The point of this thread was intended to be about the semantics of binding expressions in comprehensions. For that purpose, the PEP noting that

    total = 0
    progressive_sums = [total := total + value for value in data]

fails too is equally relevant. Of course there are many possible ways to rewrite that so it would work. That doesn't change the fact that the failing attempts "look like they should work", but don't, but could if the semantics of ":=" were defined differently inside magically-created anonymous lexically nested functions.

On 7 May 2018 at 11:32, Tim Peters <tim.peters@gmail.com> wrote:
You have the reasoning there backwards: implicitly nested scopes behave like explicitly nested scopes because that was the *easy* way for me to implement them in Python 3.0 (since I got to re-use all the pre-existing compile time and runtime machinery that was built to handle explicit lexical scopes). Everything else I tried (including any suggestions made by others on the py3k mailing list when I discussed the problems I was encountering) ran into weird corner cases at either compile time or run time, so I eventually gave up and proposed that the implicit scope used to hide the iteration variable name binding be a full nested closure, and we'd just live with the consequences of that. The sublocal scoping proposal in the earlier drafts of PEP 572 was our first serious attempt at defining a different way of doing things that would allow names to be hidden from surrounding code while still being visible in nested suites, and it broke people's brains to the point where Guido explicitly asked Chris to take it out of the PEP :) However, something I *have* been wondering is whether or not it might make sense to allow inline scoping declarations in comprehension name bindings. Then your example could be written:

    def ...:
        p = None
        while any(n % p == 0 for nonlocal p in small_primes):
            # p was declared as nonlocal in the nested scope,
            # so our p points to the last bound value
            ...

Needing to switch from "nonlocal p" to "global p" at module level would likely be slightly annoying, but also a reminder that the bound name is now visible as a module attribute. If any other form of comprehension level name binding does eventually get accepted, then inline scope declarations could similarly be used to hoist values out into the surrounding scope:

    rem = None
    while any((nonlocal rem := n % p) for nonlocal p in small_primes):
        # p and rem were declared as nonlocal in the nested scope,
        # so our rem and p point to the last bound value
        ...

Cheers, Nick. 
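The proposed `for nonlocal p` syntax doesn't exist, but what it would desugar to can be spelled out today with an explicit nested generator carrying a real `nonlocal` declaration. A hedged sketch (the names `witness_factor` and `gen` are mine, not from the thread):

```python
def witness_factor(n, small_primes):
    """Return the first p in small_primes that divides n, else None -
    an explicit spelling of what 'for nonlocal p' would automate."""
    p = None
    def gen():
        nonlocal p              # the declaration the proposed syntax implies
        for p in small_primes:
            yield n % p == 0
    if any(gen()):              # any() short-circuits at the first witness
        return p                # p still holds the witness here
    return None
```

For example, `witness_factor(15, [2, 3, 5, 7])` returns 3, the witness that made `any()` succeed.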
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On 7 May 2018 at 12:51, Nick Coghlan <ncoghlan@gmail.com> wrote:
Thinking about it a little further, I suspect the parser would reject "nonlocal name := ..." as creating a parsing ambiguity at statement level (where it would conflict with the regular nonlocal declaration statement). The extra keyword in the given clause would avoid that ambiguity problem:

    p = rem = None
    while any(rem for nonlocal p in small_primes given nonlocal rem = n % p):
        # p and rem were declared as nonlocal in the nested scope,
        # so our p and rem refer to their last bound values
        ...

Such a feature could also be used to make augmented assignments do something useful at comprehension scope:

    input_tally = 0
    process_inputs(x for x in input_iter given nonlocal input_tally += x)

Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

[Tim]
[Nick Coghlan <ncoghlan@gmail.com>]
You have the reasoning there backwards:
That's easy to believe - I also had a long history of resisting nested scopes at all ;-)
It's unfortunate that there are "consequences" at all. That kind of thing is done all the time in Lisp-ish languages, but they require explicitly declaring names' scopes. Python's "infer scope instead by looking for bindings" worked great when it had 3 scopes total, but keeps forcing "consequences" that may or may not be desired in a generally-nested-scopes world.
To which removal I was sympathetic, BTW.
Which more directly addresses the underlying problem: not really "binding expressions" per se, but the lack of control over scope decisions in comprehensions period. It's not at all that nested scopes are a poor model, it's that we have no control over what's local _to_ nested scopes the language creates. I'd say that's worth addressing in its own right, regardless of PEP 572's fate. BTW, the "p = None" there is annoying too ;-)
Or `nonlocal` could be taught that its use one level below `global` has an obvious meaning: global.
Right - as above, inline scope declarations would be applicable to all forms of comprehension-generated code. And to any other future construct that creates lexically nested functions.

On 2018-05-06 18:32, Tim Peters wrote:
I agree that is a limitation, and I see from a later message in the thread that Guido finds it compelling, but personally I don't find that particular case such a showstopper that it would tip the scales for me either way. If you have to write the workalike loop that iterates and returns the missing value, so be it. That's not a big deal. -- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown

[Tim]
[Brendan Barnwell <brenbarn@brenbarn.net>]
Guido didn't find it compelling: for that specific example to show `p` would require that for-loop targets "leak", and he remains opposed to that. I don't want that changed either. The issue instead is what the brand-new proposed ":=" should do, which isn't used in that example at all. Whether that specific example can be written in 500 other ways (of course it can) isn't really relevant. One of the ironies already noted is that PEP 572 gives an example of something that _won't_ work ("progressive_sums") which happens to be the very use case that started the current debate about assignment expressions to begin with. That raises the very same issue about ":=" that "the obvious" rewrite of my example at the top raises. Which suggests to me (& apparently to Guido too) that there may be a real issue here worth addressing. There are many use cases for binding expressions outside of synthetically generated functions. For PEP 572, it's the totality that will be judged, not just how they might work inside list comprehensions and generator expressions (the only topics in _this_ particular thread), let alone how they work in one specific example.

On Mon, May 7, 2018 at 11:32 AM, Tim Peters <tim.peters@gmail.com> wrote:
You're correct. The genexp is approximately equivalent to:

    def genexp():
        for p in small_primes:
            thisp = p
            yield n % thisp == 0

    while any(genexp()):
        n //= thisp

With generator expressions, since they won't necessarily be iterated over immediately, I think it's correct to create an actual nested function; you need the effects of closures. With comprehensions, it's less obvious, and what you're asking for might be more plausible. The question is, how important is the parallel between list(x for x in iter) and [x for x in iter]? Guido likes it, and to create that parallel, list comps MUST be in their own functions too.
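In Chris's desugaring, `thisp` is local to the synthetic function, so the `n //= thisp` line fails. Adding the `nonlocal` declaration shows the behaviour Tim is after. A runnable sketch under that assumption (the wrapper name `divide_out` is mine):

```python
def divide_out(n, small_primes):
    """Divide every listed prime factor out of n - what Tim's original
    loop would do if thisp escaped the synthetic genexp function."""
    thisp = None
    def genexp():
        nonlocal thisp          # without this line, thisp stays trapped
        for p in small_primes:
            thisp = p
            yield n % thisp == 0
    while any(genexp()):        # any() stops at the first divisor found
        n //= thisp             # thisp survived because of the nonlocal
    return n
```

For example, `divide_out(360, [2, 3, 5])` strips out all factors of 2, 3, and 5, leaving 1.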
That's a fair point. But there is another equally valid use-case for assignment expressions inside list comps: values = [y + 2 for x in iter if (y := f(x)) > 0] In this case, it's just as obvious that the name 'y' should be local to the comprehension, as 'x' is. Since there's no way to declare "nonlocal y" inside the comprehension, you're left with a small handful of options: 1) All names inside list comprehensions are common with their surrounding scope. The comprehension isn't inside a function, the iteration variable leaks, you can share names easily. Or if it *is* inside a function, all its names are implicitly "nonlocal" (in which case there's not much point having the function). 2) All names are local to their own scope. No names leak, and that includes names made with ":=". 3) Some sort of rule like "iteration variables don't leak, but those used with := are implicitly nonlocal". Would create odd edge cases eg [x for x in iter if x := x] and that would probably result in x leaking. 4) A special adornment on local names if you don't want them to leak 5) A special adornment on local names if you DO want them to leak 6) A combination of #3 and #4: "x := expr" will be nonlocal, ".x := expr" will be local, "for x in iter" will be local. Backward compatible but a pain to explain. I can't say I'm a fan of any of the complicated ones (3 through 6). Option #2 is current status - the name binding is part of the expression, the expression is inside an implicit function, so the name is bound within the function. Option 1 is plausible, but would be a backward compatibility break, with all the consequences thereof. It'd also be hard to implement cleanly with genexps, since they MUST be functions. (Unless they're an entirely new concept of callable block that doesn't include its own scope, which could work, but would be a boatload of new functionality.)
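Chris's `y := f(x)` example can already be approximated without `:=` by making the computed value itself the iteration variable, so nothing needs to leak in the first place. A sketch, where `f` and `data` are stand-ins I made up for illustration:

```python
def f(x):                      # hypothetical stand-in for the f in Chris's example
    return x - 2

data = range(5)

# Compute f(x) once per item by iterating over the mapped values;
# no assignment expression is needed, and y stays comprehension-local.
values = [y + 2 for y in map(f, data) if y > 0]
```

With these stand-ins, `values` comes out as `[3, 4]`: only `f(3) == 1` and `f(4) == 2` pass the filter, and each surviving `y` gets 2 added back.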
Is it really more useful more often?
Personally, I'd still like to go back to := creating a statement-local name, one that won't leak out of ANY statement. But the tide was against that one, so I gave up on it. ChrisA

[Chris Angelico <rosuav@gmail.com>]
I don't care how they're implemented here; I only care here about the visible semantics.
There's a difference, though: if `y` "leaks", BFD. Who cares? ;-) If `y` remains inaccessible, there's no way around that.
Since there's no way to declare "nonlocal y" inside the comprehension, you're left with a small handful of options:
i leapt straight to #3:
DOA. Breaks old code.
2) All names are local to their own scope. No names leak, and that includes names made with ":=".
Saying "local to their own scope" _assumes_ what you're trying to argue _for_ - it's circular. In fact it's impossible to know what the user intends the scope to be.
3) Some sort of rule like "iteration variables don't leak, but those used with := are implicitly nonlocal".
Explicitly, because "LHS inherits scope from its context" (whether global, nonlocal, or local) is part of what ":=" is defined to _mean_ then.
Would create odd edge cases eg [x for x in iter if x := x] and that would probably result in x leaking.
Don't care.
4) A special adornment on local names if you don't want them to leak
5) A special adornment on local names if you DO want them to leak
Probably also DOA.
Definitely DOA. ...
Is it really more useful more often?
I found no comprehensions of any kind in my code where binding expressions would actually be of use unless the name "leaked". Other code bases may, of course, yield different guesses. I'm not a "cram a lot of stuff on each line" kind of coder. But the point above remains: if they don't leak, contexts that want them to leak have no recourse. If they do leak, then the other uses would still work fine, but they'd possibly be annoyed by a leak they didn't want.
Part of that is because - as the existence of this thread attests to - we can't even control all the scopes gimmicks Python already has. So people are understandably terrified of adding even more ;-)

On Mon, May 7, 2018 at 12:34 PM, Tim Peters <tim.peters@gmail.com> wrote:
That's Steve D'Aprano's view - why not just let them ALL leak? I don't like it though.
Sorry, I meant "local to the comprehension's scope". We can't know the user's intention. We have to create semantics before the user's intention even exists.
Then let's revert the Py3 change that put comprehensions into functions, and put them back to the vanilla transformation:

    stuff = [x + 1 for x in iter if x % 3]

    stuff = []
    for x in iter:
        if x % 3:
            stuff.append(x + 1)

Now 'x' leaks as well, and it's more consistent with how people explain comprehensions. Is that a good thing? I don't think so. Having the iteration variable NOT leak means it's a self-contained unit that simply says "that thing we're iterating over".
Part of it is just that people seem to be fighting for the sake of fighting. I'm weary of it, and I'm not going to debate this point with you. You want 'em to leak? No problem. Implement it that way and I'm not going to argue it. ChrisA

[Tim]
There's a difference, though: if `y` "leaks", BFD. Who cares? ;-) If `y` remains inaccessible, there's no way around that.
[Chris]
That's Steve D'Aprano's view - why not just let them ALL leak? I don't like it though.
I didn't suggest that. I'm not suggesting changing _any_ existing behavior (quite the contrary). Since ":=" would be brand new, there is no existing behavior for it.
Exactly. That's why I would like ":=" to be defined from the start in a way that does least damage ;-)
Then let's revert the Py3 change that put comprehensions into functions, and put them back to the vanilla transformation:
Again, I'm not suggesting changing any existing behavior.
It's fine by me that for-target names don't leak. I didn't suggest changing that.
I'm more interested in real-life use cases than in arguments. My suggestion came from staring at my real-life use cases, where binding expressions in comprehensions would clearly be more useful if the names bound leaked. Nearly (but not all) of the time, they're quite happy that for-target names don't leak. Those are matters of observation rather than of argument.

On 7 May 2018 at 13:15, Tim Peters <tim.peters@gmail.com> wrote:
The issue is that because name binding expressions are just ordinary expressions, they can't be defined as "in comprehension scope they do X, in other scopes they do Y" - they have to have consistent scoping semantics regardless of where they appear. However, it occurs to me that a nonlocal declaration clause could be allowed in comprehension syntax, regardless of how any nested name bindings are spelt:

    p = rem = None
    while any((rem := n % p) for p in small_primes nonlocal (p, rem)):
        # p and rem were declared as nonlocal in the nested scope,
        # so our rem and p point to the last bound value
        ...

I don't really like that though, since it doesn't read as nicely as being able to put the nonlocal declaration inline. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

[Nick Coghlan <ncoghlan@gmail.com>]
While I'm not generally a fan of arguments, I have to concede that's a really good argument :-) Of course their definition _could_ be context-dependent, but even I'll agree they shouldn't be. Never mind!
If the idea gets traction, I'm sure we'll see 100 other syntax ideas by the time I wake up again.

For what it's worth, I'm totally +1 on inline uses of global and nonlocal. As a related improvement, I'd also like it if "global x = 5" were a legal statement. As a noob learning Python, I was surprised to find out I couldn't do that and had to split it into two lines. (Aside from a 9-hour course of C and some LabVIEW (which I totally hate), Python was my first language and still the one I'm by far most proficient with.) 2018-05-07 6:04 GMT+02:00 Tim Peters <tim.peters@gmail.com>:

On Mon, May 07, 2018 at 12:48:53PM +1000, Chris Angelico wrote:
On Mon, May 7, 2018 at 12:34 PM, Tim Peters <tim.peters@gmail.com> wrote:
I know popular opinion is against me, and backward compatibility and all that, but I wish that generator expressions and comprehensions ran in their surrounding scope, like regular for statements. (Yes, I know that makes generator expressions tricky to implement. As the guy who doesn't have to implement it, I don't have to care :-) Calling it a "leak" assumes that it is a bad thing. I don't think it is a bad thing. It's not often that I want to check the value of a comprehension loop, but when I do, I have to tear the comprehension apart into a for-loop. Even if it is only temporarily, for debugging, then put the comprehension back together. The only time I can see it is a bad thing is if I blindly copy and paste a comprehension out of one piece of code and dump it into another piece of code without checking to see that it slots in nicely without blowing away existing variables. But if you're in the habit of blindly and carelessly pasting into your code base without looking it over, this is probably the least of your worries... *wink* But what's done is done, and while there are plenty of windmills I am willing to tilt at, reversing the comprehensions scope decision is not one of them. [...]
Surely that's backwards? We ought to find out what people want before telling them that they can't have it :-)
Indeed.
Then let's revert the Py3 change that put comprehensions into functions, and put them back to the vanilla transformation:
You know we can't do that. But we do have a choice with binding expressions. *Either way*, whatever we do, we're going to upset somebody, so we simply have to decide who that will be. Actually we have at least three choices:

(1) Consistency Über Alles (whether foolish or not). Now that comprehensions are their own scope, be consistent about it. Binding expressions inside the comprehension will be contained to the comprehension. I'll hate it, but at least it is consistent and easy to remember: the entities which create a new scope are modules, classes, functions, plus comprehensions. That's going to cut out at least one motivating example though. See below.

(2) Binding assignments are *defined* as "leaking", or as I prefer, defined as existing in the lexical scope that contains the comprehension. Hence:

    # module level
    [(x := a) for a in [98, 99]]
    assert x == 99

    # class level
    class X:
        [(x := a) for a in [98, 99]]
    assert X.x == 99

    # function level
    def func():
        [(x := a) for a in [98, 99]]
        assert x == 99

Note that in *all* of these cases, the variable a does not "leak". This will helpfully support the "running total" use-case that began this whole saga:

    total = 0
    running_totals = [(total := total + x) for x in [98, 99]]
    assert total == 197

(I have to say, given that this was THE motivating use-case that began this discussion, I think it is ironic and not a little sad that the PEP has evolved in a direction that leaves this use-case unsatisfied.)

(3) A compromise: binding assignments are scoped local to the comprehension, but they are initialised from their surrounding scope. This would be similar to the way Lua works, as well as function parameter defaults. I have some vague ideas about implementation, but there's no point discussing that unless people actually are interested in this option. 
This will *half* satisfy the running-total example:

    total = 0
    running_totals = [(total := total + x) for x in [98, 99]]
    assert total == 0

Guaranteed to generate at least two Stackoverflow posts a month complaining about it, but better than nothing :-)
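Worth noting in passing: the running-total use case itself is already met in current Python by `itertools.accumulate`, though that sidesteps the scoping question rather than answering it. A minimal sketch:

```python
from itertools import accumulate

data = [98, 99]

# accumulate yields the progressive sums directly, with no name
# needing to escape any comprehension scope.
running_totals = list(accumulate(data))
total = running_totals[-1] if running_totals else 0
```

Here `running_totals` is `[98, 197]` and `total` is 197, exactly what the `[(total := total + x) for x in data]` spelling is after.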
Assuming there are no side-effects to any of the operations inside the comprehension.
Part of it is just that people seem to be fighting for the sake of fighting.
Them's fightin' words! *wink* Honestly Chris, I know this must be frustrating, but I'm not fighting for the sake of it, and I doubt Tim is either. I'm arguing because there are real use-cases which remain unmet if binding-variables inside comprehensions are confined to the comprehension. -- Steve

On Mon, May 7, 2018 at 9:42 PM, Steven D'Aprano <steve@pearwood.info> wrote:
Yeah, it's really easy when you don't have to worry about how on earth you can implement the concept of "unexecuted block of code that can be executed later even after the surrounding context has returned, but which isn't a function". :)
Does it HAVE to be initialised from the surrounding scope? What if the surrounding scope doesn't have that variable? stuff = [spam for x in items if (spam := f(x)) < 0] Is this going to NameError if you haven't defined spam? Or is the compiler to somehow figure out whether or not to pull in a value? Never mind about implementation - what are the semantics?
Which is often the case.
I don't think you are and I don't think Tim is. But do you honestly want to say that about EVERY person in these threads? ChrisA

On Mon, May 07, 2018 at 10:38:51PM +1000, Chris Angelico wrote:
No. Then it's just an uninitialised local, and it is a NameError to try to evaluate it before something gets bound to it.
stuff = [spam for x in items if (spam := f(x)) < 0]
Is this going to NameError if you haven't defined spam?
It shouldn't be an error, because by the time the comprehension looks up the value of "spam", it has been bound by the binding-expression.
Variable "spam" is not defined in any surrounding scope, so these ought to all be NameError (or UnboundLocalError):

    [spam for a in it]  # obviously
    [(spam := spam + a) for a in it]
    [spam if True else (spam := a) for a in it]
    [spam for a in it if True or (spam := a)]

They are errors because the name "spam" is unbound when you do a lookup. This is not an error, because the name is never looked up:

    [True or spam for a in it if True or (spam := a)]

Although "spam" never gets bound, neither does it get looked up, so no error. The next one is also okay, because "spam" gets bound before the first lookup:

    [(spam := spam+1) for a in it if (spam := a*2) > 0]

Here's a sketch of how I think locals are currently handled:

1. When a function is compiled, the compiler does a pass over the source and determines which locals are needed.

2. The compiler builds an array of slots, one for each local, and sets the initial value of the slot to "empty" (undefined).

3. When the function is called, if it tries reading from a local's slot which is still empty, it raises UnboundLocalError.

(am I close?)

Here's the change I would suggest:

2. The compiler builds an array of slots, one for each local:

2a. For locals that are the target of a binding-expression only:
- look up the target in the current scope (that is, not the comprehension's scope, but the scope that the comprehension is inside) using the normal lookup rules, as if you were compiling "lambda x=x: None" and needed the value of x;
- if the target is undefined, then swallow the error and leave the slot as empty;
- otherwise store a reference to that value in the slot.

2b. For all other locals, proceed as normal.
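Steps 1-3 of Steven's sketch match behaviour that is observable today. A minimal demonstration (the function `f` and its names are mine, for illustration only):

```python
def f(bind):
    # The compiler sees the assignment below, so `spam` gets a local
    # slot at compile time (steps 1-2); reading the slot while it is
    # still "empty" raises UnboundLocalError at run time (step 3).
    if bind:
        spam = "bound"
    try:
        return spam
    except UnboundLocalError:
        return "unbound"
```

`f(True)` returns "bound"; `f(False)` returns "unbound" rather than falling back to any outer `spam`, because the name was classified as local for the whole function body.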
I'm going to assume good faith, no matter the evidence :-) -- Steve

On Tue, May 8, 2018 at 12:26 AM, Steven D'Aprano <steve@pearwood.info> wrote:
Yeah, I'm pretty sure that's all correct.
It's easy when you're not implementing things. I'm going to just say "sure, go for it", and also not implement it. Have fun, whoever goes in and tries to actually do the work... I don't think there's any other construct in Python that can replicate "a thing or the absence of a thing" in that way. For instance, there's no way to copy a value from one dict into another, or delete from the target dict if it isn't in the source (other than a long-hand). I've no idea how it would be implemented, and don't feel like finding out. ChrisA

I am convinced by Tim's motivation. I hadn't thought of this use case before -- I had mostly thought "local scope in a comprehension or generator expression is the locals of the synthetic function". But Tim's reasoning feels right. The only solution that makes sense to me is Steven's (2). (1) is what the PEP currently says and what Tim doesn't want; (3) has no precedent (function defaults don't really work like this) and just gets my hackles all up. (I can't even tell if Steven meant it as a serious proposal.) So let's see if we can change PEP 572 so that := inside a comprehension or generator expression always assigns to a variable in the containing scope. It may be inconsistent with the scope of the loop control variable, but it's consistent with uses of := outside comprehensions:

    [x := 0]
    [x := i for i in range(1)]

both have the side effect of setting x to zero. I like that. There's one corner case (again) -- class scopes. If the containing scope is a function, everything's fine, we can use the existing closure machinery. If the containing scope is global, everything's fine too, we can treat it as a global. But if the containing scope is a class, we can't easily extend the current machinery. But this breakage is similar to the existing breakage with comprehensions in class scope that reference class variables:

    class C:
        hosts = ['boring', 'haring', 'tering']
        full_hosts = [host + suffix for suffix in ('.cwi.nl', '.com') for host in hosts]

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<stdin>", line 3, in C
      File "<stdin>", line 3, in <listcomp>
    NameError: name 'hosts' is not defined

I propose to punt on this case. If we want to fix it we can fix it in a number of ways and the fix can easily apply to both getting and setting -- but this is a separate fix (and we should take it out of PEP 572). PS1. The various proposals that add random extra keywords to the syntax (like 'for nonlocal i') don't appeal to me at all. PS2. 
IIRC the reason we gave loop control variables their own scope was the poor imagination of many people when it comes to choosing loop control variable names. We had seen just too many examples of

    for x in something:
        ...lots of code using x...
        blah blah [x+1 for x in something else]
        ...some more code using x, broken...

It's true that this can also happen with a for-loop statement nested inside the outer loop (and it does), but the case of a comprehension was easier to miss. I've never looked back. PS3. Comprehensions and generator expressions should be interchangeable. They just look too similar to have different semantics (and the possibly delayed execution of generator expressions is not an issue -- it's rare, and closure semantics make it work just fine). -- --Guido van Rossum (python.org/~guido)
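The Py3 hiding Guido describes is easy to demonstrate: a comprehension's control variable lives in its own scope and never touches a surrounding binding, while a for statement's control variable does. A minimal sketch (names are mine):

```python
x = "outer"

# The listcomp's x is confined to the synthetic function's scope,
# so the outer x is untouched.
squares = [x * x for x in range(3)]
after_comp = x          # still "outer"

# A for statement, by contrast, rebinds the real x.
for x in range(3):
    pass
after_loop = x          # now 2, the last loop value
```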

[Guido]
I'm trying very hard _not_ to reason. That is, I'm looking at code and trying to figure out what would actually work well, logic & consistency & preconceptions be damned. "Reasons" can be made up after the fact - which is how the world actually works regardless ;-)
He doesn't want (3) either. I can channel him on that.
So let's see if we can change PEP 572 so that := inside a comprehension or generator expression always assigns to a variable in the containing scope.
While I don't have real use cases beyond that, given that much, "consistency" kicks in to suggest that:

    def f():
        [x := 42 for x in range(1)]

makes `x` local to `f` despite that x wasn't bound elsewhere in f's body.

    def f():
        global x
        [x := 42 for x in range(1)]

binds the global `x`.

    def f():
        nonlocal x
        [x := 42 for x in range(1)]

binds `x` in the closest-containing scope in which `x` is local. The last two act as if the declaration of `x` in `f` were duplicated at the start of the synthetic function. More succinctly, `x := RHS` in a synthetic function "acts the same" as `x = RHS` appearing in the scope directly containing the synthetic function. Does that generalize to class scope too? I don't know. I never use fancy stuff in class scopes, and have no idea how they work anymore. So long as "def method(self, ...)" continues to work, I'm happy ;-)
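As a hedged aside from outside this thread: on any Python implementing PEP 572 as finally accepted (3.8+), the first of these cases can be checked directly, since `:=` in a comprehension was given exactly this containing-scope binding. Note the loop variable must be distinct from the `:=` target (reusing it, as in the examples above, was ultimately made a SyntaxError, which Tim's later correction to `for i in range(1)` anticipates):

```python
def f():
    # Under the containing-scope rule, the x bound by := inside the
    # comprehension becomes a local of f, not of the synthetic
    # function, so it is visible after the comprehension finishes.
    [x := 42 for i in range(1)]
    return x
```

Calling `f()` then returns 42 on 3.8+, with no NameError for `x`.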
And is what anyone would expect if they didn't think too much about it. ... [snipping stuff about class scope - nothing to add] ...
PS1. The various proposals that add random extra keywords to the syntax (like 'for nonlocal i') don't appeal to me at all.
They're appealing to the extent that "explicit is better than implicit" for people who actually understand how all this stuff is implemented. I don't know what percentage of Python programmers that is, though. I've certainly, e.g., seen many on Stackoverflow who can make a list comprehension work who couldn't distinguish a lexical scope from an avocado. The ":= is treated specially in comprehensions" idea is aimed more at them than at people who think invoking a synthetic anonymous lambda "is obvious".
I don't want to change anything about any of that - already believe Python struck the best balance possible.
Wholly agreed.

[Tim]
Oh, fudge - I wasn't trying to make a silly subtle point by reusing `x` as the `for` variable too. Pretend those all said "for i in range(1)" instead. Of course what happens if `x` is used in both places needs to be defined, but that's entirely beside the intended point _here_.

[Tim]
It occurs to me that, while suggestive, this is an unhelpful way to express it. It's not at all that the semantics of ":=" change inside a listcomp/genexp; it's that the latter's idea of intended _scopes_ for names is made more nuanced (inside a synthetic function created to implement a listcomp/genexp, names bound by "=" are local; names bound by ":=" are nonlocal; names bound by both are "who cares?" - a compile-time error would be fine by me, or the first person to show a real use case wins). Regardless, the runtime implementation of ":=" remains the same everywhere.

Wait, you can't use = in a listcomp, right? Or are you talking about the implementation hidden from the casual user? I thought letting := bind to the surrounding scope was fine basically because it's currently not possible, so there would be no syntactic ambiguity, and it'd actually do what people would expect. Jacco

[Tim]
[Jacco van Dorp <j.van.dorp@deonet.nl>]
Wait, you can't use = in a listcomp, right? Or are you talking about the implementation hidden from the casual user?
Sorry, I was too obscure there - I intended "=" to mean "name binding by any means other than :=". Off the top of my head, I believe that - today - the only "any means other than :=" possible in a listcomp/genexp is appearing as a target in a `for` clause (like the `i` in `[i+1 for i in iterable]`). If there's some other way I missed, I meant to cover that too. But, yes, you're right, `names bound by "="` makes no literal sense at all there.
It's not really about the semantics of `:=` so much as about how synthetic functions are defined. In most cases, it amounts to saying "in the nested function synthesized for a listcomp/genexp, if a name `x` appears as the target of a binding expression in the body, a `nonlocal x` declaration is generated near the top of the synthetic function". For example, if this block appears inside a function:

    it = (i for i in range(10))
    total = 0
    for psum in (total := total + value for value in it):
        print(psum)

under the current PEP meaning it blows up in the same way this code blows up today:

    it = (i for i in range(10))
    total = 0
    def _func(it):
        for value in it:
            total = total + value  # blows up here
            yield total
    for psum in _func(it):
        print(psum)

with

    UnboundLocalError: local variable 'total' referenced before assignment

But add

    nonlocal total

at the top of `_func()` and it works fine (displays 0, 1, 3, 6, 10, 15, ...). So it's not really about what ":=" does, but about how ":=" affects scope in synthesized nested functions.

But if you wrote a nested function yourself? There's no suggestion here that ":=" have any effect on scope decisions in explicitly given nested functions (same as for "=", it would imply "the target is local"), just on those generated "by magic" for listcomps/genexps. Maybe there should be, though. My initial thought was "no, because the user has total control over scope decisions in explicitly given functions today, but if something was magically made nonlocal they would have no way to override that".
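[A runnable sketch of Tim's expansion, collecting the partial sums in a list instead of printing; the function names `broken` and `fixed` are illustrative, not from the thread.]

```python
def broken():
    total = 0
    def _func(it):
        for value in it:
            # "total" is assigned here, so it is local to _func and
            # unbound on first use: UnboundLocalError when consumed.
            total = total + value
            yield total
    return list(_func(range(10)))

def fixed():
    total = 0
    def _func(it):
        nonlocal total          # the declaration Tim describes adding
        for value in it:
            total = total + value
            yield total
    return list(_func(range(10)))
```

`fixed()` yields the running sums 0, 1, 3, 6, ... while `broken()` raises as soon as the generator is consumed.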

So the way I envision it is that *in the absence of a nonlocal or global declaration in the containing scope*, := inside a comprehension or genexpr causes the compiler to assign to a local in the containing scope, which is elevated to a cell (if it isn't already). If there is an explicit nonlocal or global declaration in the containing scope, that is honored. Examples:

    # Simplest case, neither nonlocal nor global declaration
    def foo():
        [p := q for q in range(10)]  # Creates foo-local variable p
        print(p)  # Prints 9

    # There's a nonlocal declaration
    def bar():
        p = 42  # Needed to determine its scope
        def inner():
            nonlocal p
            [p := q for q in range(10)]  # Assigns to p in bar's scope
        inner()
        print(p)  # Prints 9

    # There's a global declaration
    def baz():
        global p
        [p := q for q in range(10)]
    baz()
    print(p)  # Prints 9

All these would work the same way if you wrote list(p := q for q in range(10)) instead of the comprehension.

We should probably define what happens when you write [p := p for p in range(10)]. I propose that this overwrites the loop control variable rather than creating a second p in the containing scope -- either way it's probably a typo anyway.

:= outside a comprehension/genexpr is treated just like any other assignment (other than in-place assignment), i.e. it creates a local unless a nonlocal or global declaration exists.

-- 
--Guido van Rossum (python.org/~guido)
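[For the record, this is essentially the behaviour PEP 572 ended up specifying; on Python 3.8+ Guido's three cases can be checked directly. A sketch returning values instead of printing:]

```python
# Requires Python 3.8+ (PEP 572 as accepted).

def foo():
    [p := q for q in range(10)]      # creates foo-local variable p
    return p

def bar():
    p = 42                           # establishes p's scope in bar
    def inner():
        nonlocal p
        [p := q for q in range(10)]  # honors the nonlocal: assigns bar's p
    inner()
    return p

def baz():
    global p
    [p := q for q in range(10)]      # honors the global declaration

baz()
```

All three end with the relevant `p` bound to 9, exactly as in the examples above.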

So the way I envision it is that *in the absence of a nonlocal or global
This seems to be getting awfully complicated. Proof? Try to write the docs for the proposed semantics. I don't understand why we went so astray from the original requirements, which could all be met by having `if` and `while` accept `as` to bind an expression to a variable that would be local to the structured statement. Cheers, -- Juancarlo *Añez*

On Tue, May 08, 2018 at 01:28:59PM -0400, Juancarlo Añez wrote:
Okay, I'll bite. I don't know why you think it's complicated: it is precisely the same as ordinary ``=`` assignment scoping rules. It is comprehensions that are the special case.

* * *

The binding expression ``<name> := <value>`` evaluates the right hand side <value>, binds it to <name>, and then returns that value. Unless explicitly declared nonlocal or global (in which case that declaration is honoured), <name> will belong to the current scope, the same as other assignments such as ``name = value``, with one difference. Inside comprehensions and generator expressions, variables created with ``for name in ...`` exist in a separate scope distinct from the usual local/nonlocal/global/builtin scopes, and are inaccessible from outside the comprehension. (They do not "leak".) That is not the case for those created with ``:=``, which belong to the scope containing the comprehension. To give an example:

    a = 0
    x = [b := 10*a for a in (1, 2, 3)]
    assert x == [10, 20, 30]
    assert a == 0
    assert b == 30
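[Modulo the ``=``/``==`` typos in the original asserts, this is exactly what Python 3.8 ended up doing; the example runs as-is at module scope:]

```python
# Python 3.8+: for-targets stay inside the comprehension, := targets leak.
a = 0
x = [b := 10*a for a in (1, 2, 3)]
assert x == [10, 20, 30]
assert a == 0   # the for-target a does not leak
assert b == 30  # the := target b does
```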
That is not the original motivation for binding expressions. The original requirements were specifically for comprehensions. https://mail.python.org/pipermail/python-ideas/2018-February/048971.html This is hardly the only time that something similar has been raised. -- Steve

[Guido]
[Juancarlo Añez <apalala@gmail.com>]
This seems to be getting awfully complicated. Proof? Try to write the docs for the proposed semantics.
Implementation details - even just partial sketches - are always "busy". Think of it this way instead: it's _currently_ the case that listcomps & genexps run in a scope S that's the same as the scope C that contains them, _except_ that names appearing as `for` targets are local to S. All other names in S resolve to exactly the same scopes they resolved to in C (local in C, global in C, nonlocal in C - doesn't matter). What changes now? Nothing in that high-level description, except that a name appearing as a binding expression target in S that's otherwise unknown in C establishes that the name is local to C. That's nothing essentially new, though - bindings _always_ establish scopes for otherwise-unknown names in Python.
"Original" depends on when you first jumped into this ;-)

[Tim] {About binding the for loop variable}
Yeah, that binding is the one I attempted to refer to. So I did understand you after all.
My naive assumption would be both. If it's just the insertion of a nonlocal statement like Tim suggested, wouldn't the comprehension blow up to:

    def implicitfunc():
        nonlocal p
        templist = []
        for p in range(10):
            p = p
            templist.append(p)
        return templist

? If it were [q := p for p in range(10)], it would be:

    def implicitfunc():
        nonlocal q
        templist = []
        for p in range(10):
            q = p
            templist.append(q)
        return templist

Why would it need to be treated differently ? (type checkers probably should, though.)
x = x is legal. Why wouldn't p := p be ?
[Juancarlo] Maybe my distrust is just don't like the new syntax, or that I'am biased towards using "as".
I share your bias, but not your distrust. (go as !)

My apologies for something unclear in my previous mail - the second block I quoted (the one without a name) originated from Guido, not from Tim.

... [Guido]
[Jacco van Dorp <j.van.dorp@deonet.nl>]
My naive assumption would be both.
Since this is all about scope, while I'm not 100% sure of what Guido meant, I assumed he was saying "p can only have one scope in the synthetic function: local or non-local, not both, and local is what I propose". For example, let's flesh out his example a bit more:

    p = 42
    [p := p for p in range(10) if p == 3]
    print(p)  # 42? 3? 9?

If `p` is local to the listcomp, it must print 42. If `p` is not-local, it must print 9. If it's some weird mixture of both, 3 makes most sense (the only time `p := p` is executed is when the `for` target `p` is 3).
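[The two coherent readings can be spelled out as explicit synthetic functions; a sketch, with illustrative names - `p` purely local to the synthetic function versus declared nonlocal:]

```python
def p_local():
    p = 42
    def _listcomp():
        result = []
        for p in range(10):        # p is local to the synthetic function
            if p == 3:
                p = p              # the "p := p" binding
                result.append(p)
        return result
    _listcomp()
    return p                       # 42: the outer p is untouched

def p_nonlocal():
    p = 42
    def _listcomp():
        nonlocal p                 # every p here is the outer p
        result = []
        for p in range(10):
            if p == 3:
                p = p
                result.append(p)
        return result
    _listcomp()
    return p                       # 9: the loop ran in the outer scope
```

No fiddling of `p`'s scope alone produces 3; that would need two distinct variables both spelled `p`, as Tim notes below.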
If it's just the insertion of a nonlocal statement like Tim suggested,
Then all occurrences of `p` in the listcomp are not-local, and the example above prints 9.
Yes.
There's no question about that one, because `q` isn't _also_ used as a `for` target. There are two "rules" here: 1. A name appearing as a `for` target is local. That's already the case. 2. All other names (including a name appearing as a binding-expression target) are not local. Clearer? If a name appears as both, which rule applies? "Both" is likely the worst possible answer, since it's incoherent ;-) If a name appears as both a `for` target and as a binding-expression target, that particular way of phrasing "the rules" suggests #1 (it's local, period) is the more natural choice. And, whether Guido consciously knows it or not, that's why he suggested it ;-)
Why would it need to be treated differently ?
Because it's incoherent. It's impossible to make the example above print 3 _merely_ by fiddling the scope of `p`. Under the covers, two distinct variables would need to be created, both of which are named `p` as far as the user can see. For my extension of Guido's example:

    def implicitfunc():
        nonlocal p
        templist = []
        for hidden_loop_p in range(10):
            if hidden_loop_p == 3:
                p = hidden_loop_p
                templist.append(hidden_loop_p)
        return templist

[Tim]
x = x is legal. Why wouldn't p := p be ?
It's easy to make it "legal": just say `p is local, period` or `p is not local, period`. The former will confuse people who think "but names appearing as binding-expression targets are not local", and the latter will confuse people who think "but names appearing as `for` targets are local". Why bother? In the absence of an actual use case (still conspicuous by absence), I'd be happiest refusing to compile such pathological code. Else `p is local, period` is the best pointless choice ;-)

With my limited experience, I'd consider 3 to make most sense, but 9 when thinking about it in the expanded form. If it's not 3, though, then the following would make most sense:

    SyntaxError("Cannot re-bind for target name in a list comprehension")  # Or something more clear.

And the rest of that mail convinces me even more that an error would be the correct solution here. Before I got on this mailing list, I never even knew comprehensions introduced a new scope. I'm really that new. Two years ago I'd look up stackoverflow to check the difference between overriding and extending a method and to verify whether I made my super() calls the right way. If something gets too weird, I think just throwing exceptions is a sensible solution that keeps the language simple, rather than making that much of a headache of something so trivially avoided. Jacco

[Tim]
[Jacco van Dorp <j.van.dorp@deonet.nl>]
Good news, then: Nick & Guido recently agreed that it would be a compile-time error. Assuming it's added to the language at all, of course.
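[That agreement stuck: in the PEP 572 that shipped with Python 3.8, reusing an iteration variable as a := target is rejected at compile time. A quick check:]

```python
# Python 3.8+: rebinding a comprehension iteration variable with :=
# is a SyntaxError, caught at compile time.
src = "[p := p for p in range(10)]"
try:
    compile(src, "<example>", "eval")
except SyntaxError as exc:
    print("rejected:", exc.msg)
```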
Before I got on this mailinglist, i never even knew comprehensions introduced a new scope. I'm really that new.
They didn't, at first. That changed over time. The real reason was so that `for` variables - which people typically give little thought to naming - didn't accidentally overwrite local variables that happened to share the same name. Like:
But you can productively use list comprehensions without knowing anything about how they're implemented, and just think "ha! Python does some happy magic for me there :-)".
Since Guido agreed with you in this case, that proves you're a true Pythonista - or maybe just that you're both Dutch ;-)

These all match my expectations. Some glosses: [Guido]
If the genexp/listcomp is at module level, then "assign to a local in the containing scope" still makes sense ("locals" and "globals" mean the same thing at module level), but "elevated to a cell" doesn't then - it's just a plain global. In absolutely all cases, what I expect is that NAME := EXPR in a genexp/listcomp do the binding _as if_ NAME = object_EXPR_evaluates_to were executed in the immediately containing scope. Describing the goal instead of part of the implementation may be easier to grasp ;-)
100% agreed. Add at module scope:

    [p := q for q in range(10)]
    print(p)  # Prints 9

But you're on your own for class scope, because I never did anything fancy enough at class scope to need to learn how it works ;-)
A compile-time error would be fine by me too. Creating two meanings for `p` is nuts - pick one in case of conflict. I suggested before that the first person with a real use case for this silliness should get the meaning their use case needs, but nobody bit, so "it's local then" is fine.
Also agreed. People have total control over scopes in explicitly given functions now, and if the compiler magically made anything nonlocal they would have no way to stop it. Well, I suppose we could add a "non_nonlocal" declaration, but I'd rather not ;-)

On 9 May 2018 at 03:57, Tim Peters <tim.peters@gmail.com> wrote:
I'd suggest that the handling of conflicting global and nonlocal declarations provides a good precedent here:
Since using a name as a binding target *and* as the iteration variable would effectively be declaring it as both local and nonlocal, or as local and global. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On 9 May 2018 at 03:06, Guido van Rossum <guido@python.org> wrote:
How would you expect this to work in cases where the generator expression isn't immediately consumed? If "p" is nonlocal (or global) by default, then that opens up the opportunity for it to be rebound between generator steps. That gets especially confusing if you have multiple generator expressions in the same scope iterating in parallel using the same binding target:

    # This is fine
    gen1 = (p for p in range(10))
    gen2 = (p for p in gen1)
    print(list(gen2))

    # This is not (given the "let's reintroduce leaking from comprehensions" proposal)
    p = 0
    gen1 = (p := q for q in range(10))
    gen2 = ((p, p := q) for q in gen1)
    print(list(gen2))

It also reintroduces the original problem that comprehension scopes solved, just in a slightly different form:

    # This is fine
    for x in range(10):
        for y in range(10):
            transposed_related_coords = [(y, x) for x, y in related_coords(x, y)]

    # This is not (given the "let's reintroduce leaking from comprehensions" proposal)
    for x in range(10):
        for y in range(10):
            related_interesting_coords = [(x, y) for x in related_x_coord(x, y) if is_interesting(y := f(x))]

Deliberately reintroducing stateful side effects into a nominally functional construct seems like a recipe for significant confusion, even if there are some cases where it might arguably be useful to folks that don't want to write a named function that returns multiple values instead.

Cheers, Nick.

-- 
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
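[Nick's "rebound between generator steps" hazard can be made concrete today with hand-written closures sharing one nonlocal - a sketch, with illustrative names, of what two leaking genexps iterating in parallel would amount to:]

```python
def interleaved():
    p = None
    def gen1():
        nonlocal p
        for q in (1, 2, 3):
            p = q * 10        # stands in for "p := q * 10" under the proposal
            yield p
    def gen2():
        nonlocal p
        for q in (4, 5, 6):
            p = q * 100
            yield p
    out = []
    for x, y in zip(gen1(), gen2()):
        # p reflects whichever generator was advanced last, not the
        # value either generator "thinks" it just bound.
        out.append((x, y, p))
    return out
```

Each tuple records both yielded values plus the shared `p`, which has already been clobbered by the second generator every time.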

On Thu, May 10, 2018 at 5:17 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
That's just one of several "don't do that" situations. *What will happen* is perhaps hard to see at a glance, but it's perfectly well specified. Not all legal code does something useful though, and in this case the obvious advice should be to use different variables.
You should really read Tim's initial post in this thread, where he explains his motivation. It sounds like you're not buying it, but your example is just a case where the user is shooting themselves in the foot by reusing variable names. When writing `:=` you should always keep the scope of the variable in mind -- it's no different when using `:=` outside a comprehension. PS. Thanks for the suggestion about conflicting signals about scope; that's what we'll do. -- --Guido van Rossum (python.org/~guido)

On 10 May 2018 at 23:22, Guido van Rossum <guido@python.org> wrote:
I can use that *exact same argument* to justify the Python 2 comprehension variable leaking behaviour. We decided that was a bad idea based on ~18 years of experience with it, and there hasn't been a clear justification presented for going back on that decision beyond "Tim would like using it sometimes". PEP 572 was on a nice trajectory towards semantic simplification (removing sublocal scoping, restricting to name targets only, prohibiting name binding expressions in the outermost iterable of comprehensions to avoid exposing the existing scoping quirks any more than they already are), and then we suddenly had this bizarre turn into "and they're going to be implicitly nonlocal or global when used in comprehension scope".
I did, and then I talked him out of it by pointing out how confusing it would be to have the binding semantics of "x := y" be context dependent.
It *is* different, because ":=" normally binds the same as any other name binding operation including "for x in y:" (i.e. it creates a local variable), while at comprehension scope, the proposal has now become for "x := y" to create a local variable in the containing scope, while "for x in y" doesn't. Comprehension scoping is already hard to explain when it's just a regular nested function that accepts a single argument, so I'm not looking forward to having to explain that "x := y" implies "nonlocal x" at comprehension scope (except that unlike a regular nonlocal declaration, it also implicitly makes it a local in the immediately surrounding scope).

It isn't reasonable to wave this away as "It's only confusing to Nick because he's intimately familiar with how comprehensions are implemented", as I also wrote some of the language reference docs for the current (already complicated) comprehension scoping semantics, and I can't figure out how we're going to document the proposed semantics in a way that will actually be reasonably easy for readers to follow. The best I've been able to come up with is:

- for comprehensions at function scope (including in a lambda expression inside a comprehension scope), a binding expression targets the nearest function scope, not the comprehension scope, or any intervening comprehension scope. It will appear in locals() the same way nonlocal references usually do.

- for comprehensions at module scope, a binding expression targets the global scope, not the comprehension scope, or any intervening comprehension scope. It will not appear in locals() (as with any other global reference).
- for comprehensions at class scope, the class scope is ignored for purposes of determining the target binding scope (and hence will implicitly create a new global variable when used in a top level class definition, and a new function local when used in a class definition nested inside a function)

Sublocal scopes were a model of simplicity by comparison :)

Cheers, Nick.

P.S. None of the above concerns apply to explicit inline scope declarations, as those are easy to explain by saying that the inline declarations work the same way as the scope declaration statements do, and can be applied universally to all name binding operations rather than being specific to ":= in comprehension scope".

-- 
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

Just a quickie - I'm out of time for now. [Guido]
[Nick]
Here's the practical difference: you can't write a listcomp or genexp AT ALL without a "for" clause, so whether "for" target names leak is an issue in virtually every listcomp or genexp ever written. Here's one where it isn't:

    [None for somelist[12] in range(10)]

Which nobody has ever seen in real life ;-) But ":=" is never required to write one - you only use it when you go out of your way to use it. I expect that will be relatively rare in real life listcomps and genexps.
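[For the curious: that pathological spelling really is legal - a `for` target need not be a plain name, and only plain names become local to the synthetic function. A sketch, observed behaviour on CPython 3.x:]

```python
def f():
    somelist = [0] * 13
    # The subscripted name "somelist" is a closure reference to f's local;
    # only bare-name for-targets get comprehension-local treatment.
    [None for somelist[12] in range(3)]
    return somelist[12]   # last value assigned by the loop
```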
and there hasn't been a clear justification presented for going back on that decision
Nobody is suggesting going back on "all and only `for` target names are local to the genexp/listcomp". To the contrary, the proposal preserves that verbatim: It's not _adding_ "oh, ya, and binding operator targets are local too". Just about everything here follows from _not_ adding that.
presented beyond "Tim would like using it sometimes".
So long as I'm the only one looking at real-life use cases, mine is the only evidence I care about ;-) I don't really care about contrived examples, unless they illustrate that a proposal is ill-defined, impossible to implement as intended, or likely to have malignant unintended consequences out-weighing their benefits.

On 2018-05-10 11:05, Tim Peters wrote:
You keep saying things like this with a smiley, and I realize you know what you're talking about (much more than I do), but I'd just like to push back a bit against that entire concept.

Number one, I think many people have been bringing in real life use cases.

Number two, I disagree with the idea that looking at individual use cases and ignoring logical argumentation is the way to go. The problem with it is that a lot of the thorny issues arise in unanticipated interactions between constructs that were designed to handle separate use cases.

I also do not think it's appropriate to say "if it turns out there's a weird interaction between two features, then just don't use those two things together". One of the great things about Python's design is that it doesn't just make it easy for us to write good code, but in many ways makes it difficult for us to write bad code. It is absolutely a good idea to think of the broad range of wacky things that COULD be done with a feature, not just the small range of things in the focal area of its intended use. We may indeed decide that some of the wacky cases are so unlikely that we're willing to accept them, but we can only decide that after we consider them. You seem to be suggesting that we shouldn't even bother thinking about such corner cases at all, which I think is a dangerous mistake.

Taking the approach of "this individual use case justifies this individual feature" leads to things like JavaScript, a hellhole of special cases, unintended consequences, and incoherence between different corners of the language. There are real cognitive benefits to having language features make logical and conceptual sense IN ADDITION TO having practical utility, and fit together into a unified whole.

Personally my feeling on this whole thread is that these changes, if implemented, are likely to decrease the average readability of Python code, and I don't see the benefits as being worth the added complexity.
-- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown

[Tim]
[Brendan Barnwell <brenbarn@brenbarn.net>]
I'm not so keen on meta-discussions either ;-)
Number one, I think many people have been bringing in real life uses cases.
Keep in mind the context here: _this_ thread is specifically about listcomps and genexps. I agree there have been tons of use cases presented for statement-oriented applications (some positive for the feature, some negative), but not so much for listcomps and genexps. It's worth noting again that "the" use case that started all this long ago was a listcomp that the current PEP points out still "won't work":

    total = 0
    progressive_sums = [total := total + value for value in data]

It's obvious what that's intended to do. It's not obvious why it blows up. It's a question of scope, and the scopes of names in synthesized functions is a thoroughly legitimate thing to question. The suggestion made in the first message of this thread was the obvious scope change needed to make that example work, although I was motivated by looking at _other_ listcomp/genexp use cases. They wanted the same scope decision as the example above. But I didn't realize that the example above was essentially the same thing until after I made the suggestion.
Number two, I disagree with the idea that looking at individual use cases and ignoring logical argumentation is the way to go.
Fine, then you argue, and I'll look at use cases ;-) Seriously, I don't at all ignore argument - but, yes, arguments are secondary to me. I don't give a rip about how elegant something is if it turns out to be unusable. Conversely, I don't _much_ care about how "usable" something is if the mental model for how it works is inexplicable.
Sure.
Sometimes it is, sometimes it isn't. For example, code using threads has to be aware of literal mountains of other features that may not work well (or at all) in a multi-threaded environment without major rewriting. Something as simple as "count += 1" may fail in mysterious ways otherwise. So it goes. But note that this is easily demonstrated by realistic code.
That one I disagree with. It's very easy to write bad code in every language I'm aware of. It's just that Python programmers are too enlightened to approve of doing so ;-)
It is absolutely a good idea to think of the broad range of wacky things that COULD be done with a feature,
So present some!
To the contrary, bring 'em on. But there is no feature in Python you can't make "look bad" by contriving examples, from two-page regular expressions to `if` statements nested 16 deep. "But no sane person would do that" is usually - but not always - "refutation" enough for such stuff.
I haven't ignored that here. The scope rule for synthesized functions implementing genexps and listcomps _today_ is: The names local to that function are the names appearing as `for` targets. All other names resolve to the same scopes they resolve to in the block containing the synthesized function. The scope rule if the suggestion is adopted? The same, along with that a name appearing as a ":=" target establishes that the name is local to the containing block _if_ that name is otherwise unknown in the containing block. There's nothing incoherent or illogical about that, provided that you understand how Python scoping works at all. It's not, e.g., adding any _new_ concept of "scope" - just spelling out what the intended scopes are. Of course it's worth noting that the scope decision made for ":=" targets in listcomps/genexps differs from the decision made for `for` target names. It's use cases that decide, for me, whether that's "the tail" or "the dog". Look again at the `progressive_sums` example above, and tell me whether _you'll_ be astonished if it works. Then are you astonished that
displays 1? Either way, are you astonished that
also displays 1? If you want to argue about "logical and conceptual sense", I believe you'll get lost in abstractions unless you _apply_ your theories to realistic examples.
Of course consensus will never be reached. That's why Guido is paid riches beyond the dreams of avarice ;-)

There's a lot of things in Brendan's email which I disagree with but will skip to avoid dragging this out even further. But there's one point in particular which I think is important to comment on. On Thu, May 10, 2018 at 11:23:00AM -0700, Brendan Barnwell wrote:
I don't think this concept survives even a cursory look at the language. Make it difficult to write bad code? Let's see now:

Anyone who has been caught by the "mutable default" gotcha will surely disagree:

    def func(arg, x=[]):
        ...

And the closures-are-shared gotcha:

    py> addone, addtwo, addthree = [lambda x: x + i for i in (1, 2, 3)]
    py> addone(100)
    103
    py> addtwo(100)
    103

We have no enforced encapsulation, no "private" or "protected" state for classes. Every single pure-Python class is 100% open for modification, both by subclasses and by direct monkey-patching of the class. The term "monkey-patch" was, if Wikipedia is to be believed, invented by the Python community, long before Ruby took to it as a life-style.

We have no compile-time type checks to tell us off if we use the same variable as a string, a list, an int, a float and a dict all in the one function. The compiler won't warn us if we assign to something which ought to be constant. We can reach into other modules' namespaces and mess with their variables, even replacing builtins.

Far from making it *hard* to do bad things, Python makes it *easy*. And that's how we love it! Consenting adults applies. We trust that code is not going to abuse these features, we trust that people aren't generally going to write list comps nested six levels deep, or dig deep into our module and monkey-patch our functions:

    import some_module
    some_module.function.__defaults__ = (123,)  # Yes, this works.

As a community, we use these powers wisely. We don't make a habit of shooting ourselves in the foot. We don't write impenetrable forests of nested comprehensions inside lambdas or stack ternary-if expressions six deep, or write meta-metaclasses.

Binding expressions can be abused. But they have good uses too. I trust the Python community will use this for the good uses, and not change the character of the language. Just as the character of the language was not ruined by comprehensions, ternary-if or decorators.

-- Steve
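[For completeness, the standard idioms that sidestep both gotchas - nothing new here, just the usual fixes:]

```python
def func(arg, x=None):
    if x is None:       # fresh list per call instead of one shared default
        x = []
    x.append(arg)
    return x

# Bind i at definition time via a default argument, so each lambda
# captures its own value rather than sharing the loop variable:
addone, addtwo, addthree = [lambda x, i=i: x + i for i in (1, 2, 3)]
```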

... [Guido]
You should really read Tim's initial post in this thread, where he explains his motivation.
[Nick]
I did, and then I talked him out of it by pointing out how confusing it would be to have the binding semantics of "x := y" be context dependent.
Ya, that was an effective Jedi mind trick when I was overdue to go to sleep ;-) To a plain user, there's nothing about a listcomp or genexp that says "new function introduced here". It looks like, for all the world, that it's running _in_ the block that contains it. It's magical enough that `for` targets magically become local. But that's almost never harmful magic, and often helpful, so worth it.
":=" target names in a genexp/listcomp are treated exactly the same as any other non-for-target name: they resolve to the same scope as they resolve to in the block that contains them. The only twist is that if such a name `x` isn't otherwise known in the block, then `x` is established as being local to the block (which incidentally also covers the case when the genexp/listcomp is at module level, where "local to the block" and "global to the block" mean the same thing). Class scope may be an exception (I cheerfully never learned anything about how class scope works, because I don't write insane code ;-) ).
It doesn't, necessarily. If `x` is already known as `global` in the block, then there's an implied `global x` at comprehension scope.
(except that unlike a regular nonlocal declaration, it also implicitly makes it a local in the immediately surrounding scope).
Only if `x` is otherwise _unknown_ in the block. If, e.g., `x` is already known in an enclosing scope E, then `x` also resolves to scope E in the comprehension. It is not made local to the enclosing scope in that case. I think it's more fruitful to explain the semantics than try to explain a concrete implementation. Python has a "lumpy" scope system now, with hard breaks among global scopes, class scopes, and all other lexical scopes. That makes implementations artificially noisy to specify. "resolve to the same scope as they resolve to in the block that contains them, with a twist ..." avoids that noise (e.g., the words "global" and "nonlocal" don't even occur), and gets directly to the point: in which scope does a name live? If you think it's already clear enough which scope `y` resolves to in

    z = (x+y for x in range(10))

then it's exactly as clear which scope `y` resolves to in

    z = (x + (y := 7) for x in range(10))

with the twist that if `y` is otherwise unknown in the containing block, `y` becomes local to the block.
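[On Python 3.8+ the second spelling behaves just as described: once the genexp actually runs, `y` lands in the containing block. A sketch:]

```python
def h():
    z = (x + (y := 7) for x in range(10))
    consumed = list(z)    # running the genexp performs the binding
    return y, consumed[0] # y leaked into h's scope; first element is 0 + 7
```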
It isn't reasonable to wave this away as "It's only confusing to Nick because he's intimately familiar with how comprehensions are implemented",
As above, though, I'm gently suggesting that being so intimately familiar with implementation details may be interfering with seeing how all those details can _obscure_ rather than illuminate. Whenever you think you need to distinguish between, e.g., "nonlocal" and "global", you're too deep in the detail weeds.
Where are those docs? I expect to find such stuff in section 4 ("Execution model") of the Language Reference Manual, but listcomps and genexps are only mentioned in passing once in the 3.6.5 section 4 docs, just noting that they don't always play well at class scope.
Isn't all of that too covered by "resolve to the same scope as they resolve to in the block that contains them .."? For example, in

    class K:
        print(g)

at module level, `g` obviously refers to the global `g`. Therefore any `g` appearing as a ":=" target in an immediately contained comprehension also refers to the global `g`, exactly the same as if `g` were any other non-for-target name in the comprehension. That's not a new rule: it's a consequence of how class scopes already work. Which remain inscrutable to me ;-)
You already know I'd be happy with being explicit too, but Guido didn't like it. Perhaps he'd like it better if it were even _more_ like regular declarations. Off the top of my head, say that a comprehension could start with a new optional declaration section, like

    def f():
        g = 12
        i = 8
        genexp = (<global g; nonlocal i> g + (j := i*2) for i in range(2))

Of course that's contrived. When the genexp ran, the `g` would refer to the global `g` (and the f-local `g` would be ignored); the local-to-f `i` would end up bound to 1, and in this "all bindings are local by default" world the ":=" binding to `j` would simply vanish when the genexp ended.

In practice, I'd be amazed to see anything much fancier than

    p = None  # annoying but worth it ;-) that is, in this world the intended scope
              # for a nonlocal needs to be explicitly established
    while any((<nonlocal p> n % p == 0 for p in small_primes)):
        n //= p

Note too: a binding expression (":=") isn't even needed then for this class of use case. OTOH, it's inexplicable _unless_ someone learns something about how a synthetic function is being created to implement the genexp.

On 10 May 2018 at 23:47, Tim Peters <tim.peters@gmail.com> wrote:
That's all well and good, but it is *completely insufficient for the language specification*. For the language spec, we have to be able to tell implementation authors exactly how all of the "bizarre edge cases" that you're attempting to hand wave away should behave by updating https://docs.python.org/dev/reference/expressions.html#displays-for-lists-se... appropriately. It isn't 1995 any more - while CPython is still the reference implementation for Python, we're far from being the only implementation, which means we have to be a lot more disciplined about how much we leave up to the implementation to define. The expected semantics for locals() are already sufficiently unclear that they're a source of software bugs (even in CPython) when attempting to run things under a debugger or line profiler (or anything else that sets a trace function). See https://www.python.org/dev/peps/pep-0558/ for details. "Comprehension scopes are already confusing, so it's OK to dial their weirdness all the way up to 11" is an *incredibly* strange argument to be attempting to make when the original better defined sublocal scoping proposal was knocked back as being overly confusing (even after it had been deliberately simplified by prohibiting nonlocal access to sublocals).
Right now, the learning process for picking up the details of comprehension scopes goes something like this:

* make the technically-incorrect-but-mostly-reliable-in-the-absence-of-name-shadowing assumption that "[x for x in data]" is semantically equivalent to a for loop (especially common for experienced Py2 devs where this really was the case!):

      _result = []
      for x in data:
          _result.append(x)

* discover that "[x for x in data]" is actually semantically equivalent to "list(x for x in data)" (albeit without the name lookup and optimised to avoid actually creating the generator-iterator)

* make the still-technically-incorrect-but-even-more-reliable assumption that the generator expression "(x for x in data)" is equivalent to

      def _genexp():
          for x in data:
              yield x

      _result = _genexp()

* *maybe* discover that even the above expansion isn't quite accurate, and that the underlying semantic equivalent is actually this (one way to discover this by accident is to have a name error in the outermost iterable expression):

      def _genexp(_outermost_iter):
          for x in _outermost_iter:
              yield x

      _result = _genexp(_outermost_iter)

* and then realise that the optimised list comprehension form is essentially this:

      def _listcomp(_outermost_iter):
          result = []
          for x in _outermost_iter:
              result.append(x)
          return result

      _result = _listcomp(data)

Now that "yield" in comprehensions has been prohibited, you've learned all the edge cases at that point - all of the runtime behaviour of things like name references, locals(), lambda expressions that close over the iteration variable, etc can be explained directly in terms of the equivalent functions and generators, so while comprehension iteration variable hiding may *seem* magical, it's really mostly explained by the deliberate semantic equivalence between the comprehension form and the constructor+genexp form.
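The fourth bullet's refinement is easy to stumble on in practice: only the outermost iterable is evaluated eagerly, in the enclosing scope, while the rest of the body doesn't run until iteration starts. A sketch (the names `make_genexp`, `missing_name`, and `missing_iterable` are made up for illustration):

```python
# Only the outermost iterable is evaluated eagerly; the genexp body is
# deferred until iteration begins.
def make_genexp():
    data = [1, 2, 3]
    return (x * missing_name for x in data)   # missing_name is undefined

g = make_genexp()   # no error yet: the body hasn't executed
try:
    next(g)         # now the body runs and the NameError surfaces
    body_error = None
except NameError as e:
    body_error = e
print("body NameError deferred until iteration:", body_error)

# By contrast, an undefined name in the outermost iterable raises at once,
# before the genexp is ever iterated:
try:
    g2 = (x for x in missing_iterable)   # raises right here
    eager_error = None
except NameError as e:
    eager_error = e
print("outermost-iterable NameError raised eagerly:", eager_error)
```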
(That's exactly how PEP 3100 describes the change: "Have list comprehensions be syntactic sugar for passing an equivalent generator expression to list(); as a consequence the loop variable will no longer be exposed")

As such, any proposal to have name bindings behave differently in comprehension and generator expression scope from the way they would behave in the equivalent nested function definitions *must be specified to an equivalent level of detail as the status quo*.

All of the attempts at such a definition that have been made so far have been riddled with action at a distance and context-dependent compilation requirements:

* whether to implicitly declare the binding target as nonlocal or global depends on whether or not you're at module scope or inside a function
* the desired semantics at class scope have been left largely unclear
* the desired semantics in the case of nested comprehensions and generator expressions have been left entirely unclear

Now, there *are* ways to resolve these problems in a coherent way, and that would be to define "parent local scoping" as a new scope type, and introduce a corresponding "parentlocal NAME" compiler declaration to explicitly request those semantics for bound names (allowing the expansions of comprehensions and generator expressions as explicitly nested functions to be adjusted accordingly). But the PEP will need to state explicitly that that's what it is doing, and fully specify how those new semantics are expected to work in *all* of the existing scope types, not just the two where the desired behaviour is relatively easy to define in terms of nonlocal and global.

Regards, Nick.

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Fri, May 11, 2018 at 9:15 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
Not quite! You missed one, just because comprehensions aren't weird enough yet. AFAIK you can't tell with the list comp, but with the genexp you can (by not iterating over it).
It's actually this:

    def _genexp(_outermost_iter):
        for x in _outermost_iter:
            yield x

    _result = _genexp(iter(_outermost_iter))

I don't think there's anything in the main documentation that actually says this, although PEP 289 mentions it in the detaily bits. [1]

ChrisA

[1] https://www.python.org/dev/peps/pep-0289/#the-details

(Note: this is an off-topic side thread, unrelated to assignment expressions. Inline comment below.) On Fri, May 11, 2018 at 9:08 AM, Chris Angelico <rosuav@gmail.com> wrote:
I'm not sure this is the whole story. I tried to figure out how often __iter__ is called in a genexpr. I found that indeed I see iter() is called as soon as the generator is brought to life, but it is *not* called a second time the first time you call next(). However the translation you show has a 'for' loop which is supposed to call iter() again. So how is this done? It seems the generated bytecode isn't equivalent to a for-loop, it's equivalent to a while loop that just calls next().

Disassembly of a regular generator:

    def foo(a):
        for x in a:
            yield x

          0 SETUP_LOOP              18 (to 20)
          2 LOAD_FAST                0 (a)
          4 GET_ITER
    >>    6 FOR_ITER                10 (to 18)
          8 STORE_FAST               1 (x)
         10 LOAD_FAST                1 (x)
         12 YIELD_VALUE
         14 POP_TOP
         16 JUMP_ABSOLUTE            6
    >>   18 POP_BLOCK
    >>   20 LOAD_CONST               0 (None)
         22 RETURN_VALUE

But for a generator:

    g = (x for x in C())

    1     0 LOAD_FAST                0 (.0)
    >>    2 FOR_ITER                10 (to 14)
          4 STORE_FAST               1 (x)
          6 LOAD_FAST                1 (x)
          8 YIELD_VALUE
         10 POP_TOP
         12 JUMP_ABSOLUTE            2
    >>   14 LOAD_CONST               0 (None)
         16 RETURN_VALUE

Note the lack of SETUP_LOOP and GET_ITER (but otherwise they are identical).

-- --Guido van Rossum (python.org/~guido)
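The iter-exactly-once behavior is easy to confirm empirically. A sketch with a counting iterable (the class name `CountingIterable` is made up for illustration):

```python
class CountingIterable:
    """Counts how many times __iter__ is invoked."""
    def __init__(self):
        self.iter_calls = 0
    def __iter__(self):
        self.iter_calls += 1
        return iter([10, 20, 30])

c = CountingIterable()
g = (x for x in c)      # iter(c) is called here, when the genexp is created
print(c.iter_calls)     # 1
print(list(g))          # [10, 20, 30]
print(c.iter_calls)     # still 1: FOR_ITER never issues a second GET_ITER
```

The GET_ITER happens in the enclosing code when the genexp is built, and FOR_ITER then consumes the already-created iterator directly.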

[Tim]
[Nick]
That's all well and good, but it is *completely insufficient for the language specification*.
I haven't been trying to write reference docs here, but so far as supplying a rigorous specification goes, I maintain the above gets "pretty close". It needs more words, and certainly isn't in the _style_ of Python's current reference docs, but that's all repairable. Don't dismiss it just because it's brief. Comprehensions already exist in the language, and so do nested scopes, so it's not necessary for this PEP to repeat any of the stuff that goes into those. Mostly it needs to specify the scopes of assignment expression target names - and the _intent_ here is really quite simple.

Here with more words, restricted to the case of assignment expressions in comprehensions (the only case with any subtleties):

Consider a name `y` appearing in the top level of a comprehension as an assignment expression target, where the comprehension is immediately contained in scope C, and the names belonging to scopes containing C have already been determined:

    ... (y := expression) ...

We can ignore that `y` also appears as a `for` target at the comprehension's top level, because it was already decided that's a compile-time error.

Consider what the scope of `y` would be if `(y := expression)` were textually replaced by `(y)`. Then what would the scope of `y` be? The answer relies solely on what the docs _already_ specify. There are three possible answers:

1. The docs say `y` belongs to scope S (which may be C itself, or a scope containing C). Then y's scope in the original comprehension is S.

2. The docs say name `y` is unknown. Then y's scope in the original comprehension is C.

3. The docs are unclear about whether #1 or #2 applies. Then the language is _already_ ill-defined.

It doesn't matter to this whether the assignment expression is, or is not, in the expression that defines the iterable for the outermost `for`.

What about that is hand-wavy?
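Rule #1 is the behavior Python 3.8+ ended up with when `y` is already known in C. A sketch (the names `f`, `y`, `n` are made up for illustration):

```python
# Rule #1 sketch on Python 3.8+: y is already local to the containing
# scope C (here, f), so the ":=" target resolves to that same scope.
def f():
    y = "original"              # y belongs to f's scope
    [y := n for n in range(3)]  # each binding rebinds f's y
    return y

print(f())  # 2
```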
Defining semantics clearly and unambiguously doesn't require specifying a concrete implementation (the latter is one possible way to achieve the goal - but _here_ it's a convoluted PITA because Python has no way to explicitly declare intended scopes). Since all questions about scope are reduced by the above to questions about Python's _current_ scope rules, it's as clear and unambiguous as Python's current scope rules. Now those may not be the _intended_ rules in all cases. That deserves deep scrutiny. But claiming it's too vague to scrutinize doesn't fly with me. If there's a scope question you suspect can't be answered by the above, or that the above gives an unintended answer to, by all means bring that up! If your question isn't about scope, then I'd probably view it as being irrelevant to the current PEP (e.g., what `locals()` returns depends on how the relevant code object attributes are set, which are in turn determined by which scopes names belong to relative to the code block's local scope, and it's certainly not _this_ PEP's job to redefine what `locals()` does with that info).

Something to note: for-target names appearing in the outermost `for` _may_ have different scopes in different parts of the comprehension.

    y = 12
    [y for y in range(y)]

There the first two `y`'s have scope local to the comprehension, but the last `y` is local to the containing block. But an assignment expression target name always has the same scope within a comprehension. In that specific sense, their scope rules are "more elegant" than for-target names. This isn't a new rule, but a logical consequence of the scope-determining algorithm given above. It's a _conceptual_ consequence of the fact that assignment expression targets are "intended to act like" the bindings are performed _in_ scope C rather than in the comprehension's scope.
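The mixed-scope behavior of for-target names is observable in any current Python. A sketch (the name `result` is made up for illustration):

```python
# The outermost iterable expression sees the *outer* y; the for-target y
# is a different, comprehension-local name that does not leak.
y = 12
result = [y for y in range(y)]   # range(y) uses the outer y == 12
print(len(result))   # 12
print(y)             # 12 - unchanged by the comprehension's loop variable
```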
And that's no conceptually weirder than that it's _already_ the case that the expression defining the iterable of the outermost `for` _is_ evaluated in scope C (which I'm not a fan of, but which is rhetorically convenient to mention here ;-) ). As I've said more than once already, I don't know whether this should apply to comprehensions at class scope too - I've never used a comprehension in class scope, and doubt I ever will. Without use cases I'm familiar with, I have no idea what might be most useful there. Best uninformed guess is that the above makes decent sense at class scope too, especially given that I've picked up on that people are already baffled by some comprehension behavior at class scope. I suspect that you already know, but find it rhetorically convenient to pretend this is all so incredibly unclear you can't possibly guess ;-)
For the language spec, we have to be able to tell implementation authors exactly how all of the "bizarre edge cases"
Which are?
that you're attempting to hand wave away
Not attempting to wave them away - don't know what you're referring to. The proposed scope rules are defined entirely by straightforward reference to existing scope rules - and stripped of all the excess verbiage amount to no more than "same scope in the comprehension as in the containing scope".
should behave by updating https://docs.python.org/dev/reference/expressions.html#displays-for-lists-se...
Thanks for the link! I hadn't seen that before. If the PEP gets that far, I'd think harder about how it really "ought to be" documented. I think, e.g., that scope issues should be more rigorously handled in section 4.2 (which is about binding and name resolution).
What in the "more words" above was left to the implementation's discretion? I can already guess you don't _like_ the way it's worded, but that's not what I'm asking about.
As above, what does that have to do with PEP 572? The docs you referenced as a model don't even mention `locals()` - but PEP 572 must? Well, fine: from the explanation above, it's trivially deduced that all names appearing as assignment expression targets in comprehensions will appear as free variables in their code blocks, except for when they resolve to the global scope. In the former case, it looks like `locals()` will return them, despite that they're _not_ local to the block. But that's the same thing `locals()` does for free variables created via any means whatsoever - it appears to add all the names in code_object.co_freevars to the returned dict. I have no idea why it acts that way, and wouldn't have done it that way myself. But if that's "a bug", it would be repaired for the PEP 572 cases at the same time and in the same way as for all other freevars cases. Again, the only thing at issue here is specifying intended scopes. There's nothing inherently unique about that.
That's an extreme characterization of what, in reality, is merely specifying scopes. That

    total = 0
    sums = [total := total + value for value in data]

blows up without the change is at least as confusing - and is more confusing to me.
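With the change (as eventually shipped in Python 3.8), the example reads exactly as it looks. A sketch with made-up data:

```python
# Running-total sketch under the adopted PEP 572 semantics (Python 3.8+):
# total's ":=" bindings land in the containing scope, same as "total = 0".
data = [1, 2, 3, 4]
total = 0
sums = [total := total + value for value in data]
print(sums)    # [1, 3, 6, 10]
print(total)   # 10 - all instances of total refer to the same name
```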
I'm done arguing about this part ;-)
Right now, the learning process for picking up the details of comprehension scopes goes something like this:
Who needs to do this? I'm not denying that many people do, but is that a significant percentage of those who merely want to _use_ comprehensions? We already did lots of heroic stuff apparently attempting to cater to those who _don't_ want to learn about their implementation, like evaluating the outer iterable "at once" outside the comprehension scope, and - indeed - bothering to create a new scope for them at all. Look at the "total := total + value" example again and really try to pretend you don't know anything about the implementation. "It works!" is a happy experience :-) For the rest of this message, it's an entertaining and educational development. I'm not clear on what it has to do with the PEP, though.
I don't see any of those Python workalike examples in the docs. So which "status quo" are you referring to? You already know it's possible, and indeed straightforward, to write functions that model the proposed scope rules in any given case, so what's your real point? They're "just like" the stuff above, possibly adding a sprinkling of "nonlocal" and/or "global" declarations. They don't require changing anything fundamental about the workalike examples you've already given - just adding cruft to specify scopes. I don't want to bother doing it here, because it's just tedious, and you _already know_ it. Most tediously, because there's no explicit way to declare a non-global scope in Python, in the

"""
2. The docs say name `y` is unknown. Then y's scope in the original comprehension is C.
"""

case it's necessary to do something like:

    if 0:
        y = None

in the scope containing the synthetic function so that the contained "nonlocal y" declaration knows which scope `y` is intended to live in. (The "if 0:" block is optimized out of existence, but after the compiler has noticed the local assignment to `y` and so records that `y` is containing-scope-local.) Crap like that isn't really illuminating.
That's artificial silliness, though. Already suggested that Python repair one of its historical scope distinctions by teaching `nonlocal` that

    nonlocal x

in a top-level function is a synonym for

    global x

in a top-level function. In every relevant conceptual sense, the module scope _is_ the top-level lexical scope. It seems pointlessly pedantic to me to insist that `nonlocal` _only_ refer to a non-global enclosing lexical scope. Who cares? The user-level semantically important part is "containing scope", not "is implemented by a cell object". In the meantime, BFD. So long as the language keyword insists on making that distinction, ya, it's a distinction that needs to be made by users too (and by the compiler regardless). This isn't some inherently new burden for the compiler either. When it sees a not-local name in a function, it already has to figure out whether to reference a cell or pump out a LOAD_GLOBAL opcode.
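The asymmetry being complained about is easy to trip over today. A sketch using `compile()` to show the compile-time rejection (the snippets are made up for illustration):

```python
# `global` works in a top-level function, but `nonlocal` is rejected at
# compile time because there is no enclosing *function* scope - the module
# scope doesn't count.
compile("def f():\n    global x\n    x = 1\n", "<demo>", "exec")    # fine

try:
    compile("def f():\n    nonlocal x\n    x = 1\n", "<demo>", "exec")
    rejected = False
except SyntaxError as e:
    rejected = True
    print("nonlocal rejected at top level:", e)
print(rejected)   # True
```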
* the desired semantics at class scope have been left largely unclear
Covered before. Someone who knows something about _desired_ class scope behavior needs to look at that. That's not me.
* the desired semantics in the case of nested comprehensions and generator expressions has been left entirely unclear
See the "more words" version above. It implies that scopes need to be resolved "outside in" for nesting of any kind. Which they need to be anyway, e.g., to make the "is this not-local name a cell or a global?" distinction in any kind of function code.
Sorry, I don't know what that means. I don't even know what "compiler declaration" alone means. Regardless, there's nothing here that can't be explained easily enough by utterly vanilla lexically nested scopes. All the apparent difficulties stem from the inability to explicitly declare a name's intended scope, and that the "nonlocal" keyword in a top-level function currently refuses to acknowledge that the global scope _is_ the containing not-local scope. If you mean adding a new statement to Python

    parentlocal NAME
    ...

sure, that could work. But it obscures that the problem just isn't hard enough to require such excessive novelty in Python's scope gimmicks. The correct place to declare NAME's scope is _in_ NAME's intended scope, the same as in every other language with lexical scoping. There's also that the plain English meaning of "parent local" only applies to rule #2 at the top, and to the proper subset of cases in rule #1 where it turns out that S is C. In the other rule #1 cases, "parentlocal" would be a misleading name for the less specific "nonlocal" or the more specific "global". Writing workalike functions by hand isn't difficult regardless, just tedious (even without the current proposal!), and I don't view it as a significant use case regardless. I expect the minority who do it have real fun with it for a day or two, and then quite possibly never again. Which is a fair summary of my own life ;-)
So you finally admit they _are_ relatively easy to define ;-) What, specifically, _are_ "all of the existing scope types"? There are only module, class, and function scopes in my view of the world (and "comprehension scope" is just a name given at obvious times to function scope in my view of the world). If you also want piles of words about, e.g., how PEP 572 acts in all cases in smaller blocks, like code typed at a shell, or strings passed to eval() or exec(), you'll first have to explain why this was never necessary for any previous feature.

PS: I hope you appreciate that I didn't whine about microscopic differences in the workalike examples' generated byte code ;-)

[Tim]
Something related to ponder: what's the meaning of the following _without_ the proposed scope change? So the Golden Binding Rule (GBR) applies then:

    GBR: binding a name by any means always makes the name local to the
    block the binding appears in, unless the name is declared "global"
    or "nonlocal" in the block.

    def f():
        ys = [y for _ in range(y := 5)]

The second instance of `y` is local - but local to what? Since the range is evaluated _in_ f's scope, presumably that instance of `y` is local to `f`. What about the first instance of `y`? Is that _not_ local to the comprehension despite that the GBR insists it must be local to the comprehension? Or does it raise UnboundLocalError for consistency with the GBR, and "well, so just don't use any name in a comprehension that appears as an assignment expression target in the expression defining the iterable for the outermost `for`"? Or is it that despite that `range(y := 5)` is executed in f's scope, the _binding_ is actually performed in the comprehension's scope to a comprehension-local `y`, to both preserve GBR and avoid the UnboundLocalError? But then what if `print(y)` is added after? If `range(y := 5)` really was executed in f's scope, surely that must print 5.

Then what about

    [y for y in range(y := 5)]

? Now that there's another binding inside the comprehension establishing that `y` is local to the comprehension "for real", does that work fine and the rule changes to

    well, so just don't use any name in a comprehension that appears as an
    assignment expression target in the expression E defining the iterable
    for the outermost `for` - unless the name is _also_ used in a binding
    context in the comprehension outside of E too

? Or is that a compile-time error despite that the first 2 y's are now obviously comprehension-local and the final y obviously f-local? Or are assignment expressions disallowed in the expression defining the iterable for the outermost `for`, and both examples are compile-time errors?
Talk about incoherent ;-) Under the proposed change, all instances of `y` are local to `f` in the first example, and the second example is a compile-time error for a _coherent_ reason (the ":=" binding implies "not local" for `y` - which has nothing to do with that it's in the outermost `for` -, but the "for y in" binding implies "local" for `y`).

Just showing an example of "by hand" code emulating nesting of comprehensions, with a highly dubious rebinding, in the inner comprehension, of an outer comprehension's local for-target.

    list(i + sum((i := i+1) + i for j in range(i)) for i in range(5))

I don't believe I have compelling use cases for nesting listcomps/genexps, so that's just made up to be an example of atrocious feature abuse :-)

In the outer genexp, `i` is obviously local, as is `j` in the inner genexp. But the assignment expression in the inner genexp demands that `i` _there_ be not-local. To which scope does the inner `i` belong? To the same scope it would belong if `i := i+1` were replaced by `i`, which the docs today say is the outer genexp's scope. So that's what it is.

Here's code to emulate all that, with a bit more to demonstrate that `i` and `j` in the scope containing that statement remain unchanged. The only "novelty" is that a `nonlocal` declaration is needed to establish an intended scope.

    def f():
        i = 42
        j = 53

        def outer(it):
            def inner(it):
                nonlocal i
                for j in it:
                    i = i+1
                    yield i
            for i in it:
                yield i + sum(inner(range(i))) + i

        print(list(outer(range(5))))
        print(i, j)

    f()

The output:

    [0, 5, 13, 24, 38]
    42 53

Since the code is senseless, so is the list it generates ;-) Showing it this way may make it clearer:

    [0+(0)+0, 1+(2)+2, 2+(3+4)+4, 3+(4+5+6)+6, 4+(5+6+7+8)+8]

Ah, fudge - I pasted in the wrong "high-level" code. Sorry! The code that's actually being emulated is not
list(i + sum((i := i+1) + i for j in range(i)) for i in range(5))
but

    list(i + sum((i := i+1) for j in range(i)) + i for i in range(5))
...
I have piles of these, but they're all equally tedious so I'll stop with this one ;-)

[Nick]
That's all well and good, but it is *completely insufficient for the language specification*.
And if you didn't like those words, you're _really_ gonna hate this ;-) I don't believe more than just the following is actually necessary, although much more than this would be helpful. I spent the first 15-plus years of my career writing compilers for a living, so am sadly resigned to the "say the minimum necessary for a long argument to conclude that it really was the minimum necessary" style of language specs. That's why I exult in giving explanations and examples that might actually be illuminating - it's not because I _can't_ be cryptically terse ;-)

Section 4.2.1 (Binding of names) of the Language Reference Manual has a paragraph starting with "The following constructs bind names:". It really only needs another two-sentence paragraph after that to capture all of the PEP's intended scope semantics (including my suggestion):

"""
An assignment expression binds the target, except in a function F synthesized to implement a list comprehension or generator expression (see XXX). In the latter case, if the target is not in F's environment (see section 4.2.2), the target is bound in the block containing F.
"""

That explicitly restates my earlier "rule #2" in the language already used by the manual. My "rule #1" essentially vanishes as such, because it's subsumed by what the manual already means by "F's environment".

This may also be the best place to add another new sentence:

"""
Regardless, if the target also appears as an identifier target of a `for` loop header in F, a `SyntaxError` exception is raised.
"""

Earlier, for now-necessary disambiguation, I expect that in

    ... targets that are identifiers if occurring in an assignment, ...

" statement" should be inserted before the comma.

[Tim, suggests changes to the Reference Manual's 4.2.1]
Let me try that again ;-) The notion of "environment" includes the global scope, but that's not really wanted here. "Environment" has more of a runtime flavor anyway. And since nobody will tell me anything about class scope, I read the docs myself ;-) And that's a problem, I think! If a comprehension C is in class scope S, apparently the class locals are _not_ in C's environment. Since C doesn't even have read access to S's locals, it seems to me bizarre that ":=" could _create_ a local in S. Since I personally couldn't care less about running comprehensions of any kind at class scope, I propose to make `:=` a SyntaxError if someone tries to use a comprehension with ':=' at class scope (of course they may be able to use ":=" in nested comprehensions anyway - not that anyone would). If someone objects to that, fine, you figure it out ;-)

So here's another stab.

"""
An assignment expression binds the target, except in a function F synthesized to implement a list comprehension or generator expression (see XXX). In the latter case[1]:

- If the target also appears as an identifier target of a `for` loop header in F, a `SyntaxError` exception is raised.

- If the block containing F is a class block, a `SyntaxError` exception is raised.

- If the target is not local to any function enclosing F, and is not declared `global` in the block containing F, then the target is bound in the block containing F.

Footnote:

[1] The intent is that runtime binding of the target occurs as if the binding were performed in the block containing F. Because that necessarily makes the target not local in F, it's an error if the target also appears in a `for` loop header, which is a local binding for the same target. If the containing block is a class block, F has no access to that block's scope, so it doesn't make sense to consider the containing block. If the target is already known to the containing block, the target inherits its scope resolution from the containing block.
Else the target is established as local to the containing block. """ I realize the docs don't generally use bullet lists. Convert to WallOfText if you must. The material in the footnote would usually go in a "Rationale" doc instead, but we don't have one of those, and I think the intent is too hard to deduce without that info. And repeating the other point, to keep a self-contained account:
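This is, in fact, how PEP 572 was accepted: the class-scope case became a SyntaxError in Python 3.8+. A sketch (the class and names are made up; `compile()` is used so the bad form never executes):

```python
# The proposed rule shipped: ":=" inside a comprehension in a class body
# is rejected at compile time.
src = (
    "class C:\n"
    "    values = [y := i for i in range(3)]\n"
)
try:
    compile(src, "<demo>", "exec")
    error = None
except SyntaxError as e:
    error = e
print(type(error).__name__)   # SyntaxError
```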

On 05/12/2018 11:41 PM, Tim Peters wrote:
    Python 3.7.0b3+ (heads/bpo-33217-dirty:28c1790, Apr 5 2018, 13:10:10)
    [GCC 4.8.2] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    --> class C:
    ...     huh = 7
    ...     hah = [i for i in range(huh)]
    ...
    --> C.hah
    [0, 1, 2, 3, 4, 5, 6]

Same results clear back to 3.3 (the oldest version of 3 I have). Are the docs wrong? Or maybe they just refer to functions:

    --> class C:
    ...     huh = 7
    ...     hah = [i for i in range(huh)]
    ...     heh = lambda: [i for i in range(huh)]
    ...
    --> C.hah
    [0, 1, 2, 3, 4, 5, 6]
    --> C.heh()
    Traceback (most recent call last):
      File "test_class_comp.py", line 7, in <module>
        print(C.heh())
      File "test_class_comp.py", line 4, in <lambda>
        heh = lambda: [i for i in range(huh)]
    NameError: global name 'huh' is not defined

So a class-scope comprehension assignment expression should behave as you originally specified.

-- ~Ethan~

[Tim]
[Ethan Furman <ethan@stoneleaf.us>]
As Chris already explained (thanks!), the expression defining the iterable for the outermost `for` (which, perhaps confusingly, is the _leftmost_ `for`) is treated specially in a comprehension (or genexp), evaluated at once _in_ the scope containing the comprehension, not in the comprehension's own scope. Everything else in the comprehension is evaluated in the comprehension's scope.

I just want to add that it's really the same thing as your lambda example. Comprehensions are also implemented as lambdas (functions), but invisible functions created by magic. The synthesized function takes one argument, which is the expression defining the iterable for the outermost `for`. So, skipping irrelevant-to-the-point details, your original example is more like:

    class C:
        huh = 7
        def _magic(it):
            return [i for i in it]
        hah = _magic(range(huh))

Since the `range(huh)` part is evaluated _in_ C's scope, no problem.

For a case that blows up, as Chris did you can add another `for` as "outermost", or just try to reference a class local in the body of the comprehension:

    class C2:
        huh = 7
        hah = [huh for i in range(5)]

That blows up (NameError on `huh`) for the same reason your lambda example blows up, because it's implemented like:

    class C:
        huh = 7
        def _magic(it):
            return [huh for i in it]
        hah = _magic(range(5))

and C's locals are not in the environment seen by any function called from C's scope.

A primary intent of the proposed ":= in comprehensions" change is that you _don't_ have to learn this much about implementation cruft to guess what a comprehension will do when it contains an assignment expression. The intent of

    total = 0
    sums = [total := total + value for value in data]

is obvious - until you think too much about it ;-) Because there's no function in sight, there's no reason to guess that the `total` in `total = 0` has nothing to do with the instances of `total` inside the comprehension.
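Both class-scope cases can be reproduced directly (a sketch; the class names are made up, and the failing class statement is wrapped in try/except so the NameError can be observed):

```python
class C:
    huh = 7
    hah = [i for i in range(huh)]   # fine: range(huh) is evaluated in C's scope

print(C.hah)   # [0, 1, 2, 3, 4, 5, 6]

try:
    class C2:
        huh = 7
        hah = [huh for i in range(5)]   # huh is read inside the synthetic
                                        # function, which can't see C2's locals
    error = None
except NameError as e:
    error = e
print(error)   # name 'huh' is not defined
```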
The point of the change is to make them all refer to the same thing, as they already do in (the syntactically similar, but useless):

    total = 0
    sums = [total == total + value for value in data]

Except even _that_ doesn't work "as visually expected" in class scope today. The `total` inside the comprehension refers to the closest (if any) scope _containing_ the `class` statement in which `total` is local (usually the module scope, but may be a function scope if the `class` is inside nested functions). In function and module scopes, the second `total` example does work in "the obvious" way, so in those scopes I'd like to see the first `total` example do so too.

[Tim]
FYI, that's still not right, but I've been distracted by trying to convince myself that the manual actually defines what happens when absurdly deeply nested functions mix local values for a name at some levels with a `global` declaration of the name at other levels. I suspect that the above should be reworded to the simpler:

- If the target is not declared `global` or `nonlocal` in the block containing F, then the target is bound in the block containing F.

That makes "intuitive sense" because if the target is declared `global` or `nonlocal`, the meaning of binding in the block is already defined to affect a not-local scope, while if it's not declared at all then binding in the block "should" establish that it's local to the block (regardless of how containing scopes treat the same name). But whether that all follows from what the manual already says requires more staring at it ;-)

Regardless, if anyone were to point it out, I'd agree that it _should_ count against this that establishing which names are local to a block may require searching top-level comprehensions in the block for assignment expressions. On a scale of minus a million to plus a million, I'd only weight that in the negative thousands, though ;-)
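The "mix local values at some levels with `global` at other levels" case can at least be demonstrated concretely (standard Python 3): a `global` declaration in a nested function binds the module-level name, skipping any same-named locals at intermediate levels.

```python
x = "module"

def outer():
    x = "outer"          # local to outer; untouched by inner()
    def inner():
        global x         # binds the *module* x, skipping outer's local x
        x = "rebound"
    inner()
    return x             # outer's own local is unaffected

print(outer())  # outer
print(x)        # rebound
```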

[Nick Coghlan <ncoghlan@gmail.com> ]
I'm most interested in what sensible programmers can do easily that's of use, not really about pathologies that can be contrived.
Sure.
# This is not (given the "let's reintroduce leaking from comprehensions" proposal)
Be fair: it's not _re_introducing anything. It's brand new syntax for which "it's a very much intended feature" that a not-local name can be bound. You have to go out of your way to use it. Where it doesn't do what you want, don't use it.
p = 0
I'm not sure of the intent of that line. If `p` is otherwise unknown in this block, its appearance as a binding operator target in an immediately contained genexp establishes that `p` is local to this block. So `p = 0` here just establishes that directly. Best I can guess, the 0 value is never used below.
gen1 = (p := q for q in range(10))
I expect that's a compile-time error, grouping as

    gen1 = (p := (q for q in range(10)))

but without those explicit parentheses delimiting the "genexp part" it may not be _recognized_ as being a genexp. With the extra parens, it binds both `gen1` and `p` to the genexp, and `p` doesn't appear in the body of the genexp at all. Or did you intend

    gen1 = ((p := q) for q in range(10))

? I'll assume that's so.
gen2 = (p, p := q for q in gen1)
OK, I really have no guess about the intent there. Note that

    gen2 = (p, q for q in gen1)

is a syntax error today, while

    gen2 = (p, (q for q in gen1))

builds a 2-tuple. Perhaps

    gen2 = ((p, p := q) for q in gen1)

was intended? Summarizing:

    gen1 = ((p := q) for q in range(10))
    gen2 = ((p, p := q) for q in gen1)

is my best guess.
print(list(gen2))
[(0, 0), (1, 1), (2, 2), ..., (9, 9)]

But let's not pretend it's impossible to do that today; e.g., this code produces the same:

    class Cell:
        def __init__(self, value=None):
            self.bind(value)
        def bind(self, value):
            self.value = value
            return value

    p = Cell()
    gen1 = (p.bind(q) for q in range(10))
    gen2 = ((p.value, p.bind(q)) for q in gen1)
    print(list(gen2))

Someone using ":=" INTENDS to bind the name, just as much as someone deliberately using that `Cell` class.
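As a historical footnote: under the semantics PEP 572 was eventually accepted with (Python 3.8+, where a `:=` target in a comprehension binds in the containing scope), the guessed spelling runs as-is at module scope and produces exactly the predicted output:

```python
# p is bound in the *enclosing* (module) scope by both genexps, so gen2's
# read of p sees the value gen1 just bound before yielding q.
gen1 = ((p := q) for q in range(10))
gen2 = ((p, p := q) for q in gen1)
print(list(gen2))  # [(0, 0), (1, 1), ..., (9, 9)]
```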
I'm not clear on what "This is fine" means, other than that the code does whatever it does. That's part of why I so strongly prefer real-life use cases. In the code above, I can't imagine what the intent of the code might be _unless_ they're running tons of otherwise-useless code for _side effects_ performed by calling `related_coords()`. If "it's functional", they could do the same via

    x = y = 9
    transposed_related_coords = [y, x for x, y in related_coords(x, y)]

except that's a syntax error ;-) I assume

    transposed_related_coords = [(y, x) for x, y in related_coords(x, y)]

was intended.

BTW, I'd shoot anyone who tried to check in that code today ;-) It inherently relies on the fact that the name `x` inside the listcomp refers to two entirely different scopes, and that's Poor Practice (the `x` in the `related_coords()` call refers to the `x` in `for x in range(10)`, but all other instances of `x` refer to the listcomp-local `x`).
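The "one name, two scopes" trap is easy to show in a self-contained snippet (standard Python 3; `related_coords` from the quoted code isn't defined here, so `range` stands in):

```python
x = 3
# The leftmost range(x) is evaluated in the enclosing scope, so it sees x == 3.
# The second range(x) runs inside the comprehension and sees the loop-local x.
pairs = [(x, y) for x in range(x) for y in range(x)]
print(pairs)  # [(1, 0), (2, 0), (2, 1)]
print(x)      # 3 - the loop variable does not leak in Python 3
```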
Same syntax error there (you need parens around "x, y" at the start of the listcomp). Presumably they _intended_ to build (x, f(x)) pairs when and only when `f(x)` "is interesting". In what specific way does the code fail to do that? Yes, the outer `y` is rebound, but what of it? When the statement completes, `y` will be rebound to the next value from the inner range(10), and that's the value of `y` seen by `related_x_coord(x, y)` the next time the loop body runs. The binding done by `:=` is irrelevant to that. So I don't see your point in that specific example, although - sure! - of course it's possible to contrive examples where it really would matter. For example, change the above in some way to use `x` as the binding operator target inside the listcomp. Then that _could_ affect the value of `x` seen by `related_x_coord(x, y)` across inner loop iterations.
Deliberately reintroducing stateful side effects into a nominally functional construct seems like a recipe for significant confusion,
Side effects of any kind anywhere can create significant confusion. But Python is not a functional language, and if you don't want side effects due to ":=" in synthetic functions, you're not required to use ":=" in that context. That said, I agree "it would be nice" if advanced users had a way to explicitly say which scope they want.
even if there are some cases where it might arguably be useful to folks that don't want to write a named function that returns multiple values instead.
Sorry, I didn't follow that - functions returning multiple values?

On 5/7/2018 1:38 PM, Guido van Rossum wrote:
If I am understanding correctly, this would also let one *intentionally* 'leak' (export) the last value of the loop variable when wanted:

    [math.log(xlast := x) for x in it if x > 0]
    print(xlast)
This is a special case of the fact that no function called in class scope can access class variables, even if defined in the class scope.
    Traceback (most recent call last):
      File "<pyshell#5>", line 1, in <module>
        class C:
      File "<pyshell#5>", line 5, in C
        z = f()
      File "<pyshell#5>", line 4, in f
        return x
    NameError: name 'x' is not defined

I would find it strange if only functions defined by a comprehension were given new class scope access.
To me, this is the prime justification for the 3.0 comprehension change. I currently see a comprehension as a specialized generator expression. A generator expression generalizes math set builder notation. If left 'raw', the implied function yields the values generated (what else could it do?). If a collection type is indicated by the fences and expression form, values are instead added to an anonymous instance thereof. -- Terry Jan Reedy
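Terry's description of comprehensions as collection-building generator expressions can be restated as executable checks (standard Python 3):

```python
# A comprehension is equivalent to feeding the corresponding genexp into
# the collection's constructor.
data = range(5)
assert [x * x for x in data] == list(x * x for x in data)
assert {x * x for x in data} == set(x * x for x in data)
assert {x: x * x for x in data} == dict((x, x * x) for x in data)
print("all equivalent")
```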

On Mon, May 07, 2018 at 10:38:09AM -0700, Guido van Rossum wrote:
It doesn't get my hackles up as much as you, but it's not really what I want. It's just a compromise between what I *don't* want (1), which fails to solve the original motivating example that started this discussion, and what Chris was pushing back against (2).
+1 Whether the current class behaviour is "broken" or desirable or somewhere in between, it is what we have now and it's okay if binding expressions have the same behaviour. -- Steve

yes. I have some probably tangential to bad arguments but I'm going to make them anyways, because I think := makes the most sense along with SLNB.

first, := vs post-hoc (e.g. where or given)

base case:

    [ x for x in range(1) ]

while obvious to all of us, reading left to right does not yield what x is till later.

    [ (x, y) for x in range(1) for y in range(1) ]

doubly so. If x or y were defined above, it would not be clear until the right end what context they had.

    [ (x, y) for x in range(n) given y = f(n) ]

I don't know what's the iterator till after 'for'

    [ (x, y := f(n)) for x in range(n) ]

At a minimum, I learn immediately that y is not the iterator. Slightly less cognitive load. it's not that one is better, or that either is unfamiliar, it's about having to hold a "promise" in my working memory, vs getting an immediate assignment earlier. (it's a metric!)

now my silly argument. ":" is like a "when" operator. if y==x:

On May 6, 2018 8:41:26 PM Tim Peters <tim.peters@gmail.com> wrote:
Couldn't you just do:

    def first(it):
        return next(it, None)

    while (item := first(p for p in small_primes if n % p == 0)):
        # ...

IMO for pretty much anything more complex, it should probably be a loop in its own function.
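Ryan's spelling runs as written under the accepted PEP 572 (Python 3.8+); here is a complete version with concrete values filled in for illustration:

```python
def first(it):
    # next() with a default avoids StopIteration when no witness exists
    return next(it, None)

small_primes = [2, 3, 5, 7]
n = 360  # = 2**3 * 3**2 * 5

while (item := first(p for p in small_primes if n % p == 0)):
    n //= item

print(n)  # 1: every small-prime factor was divided out
```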
-- Ryan (ライアン) Yoko Shimomura, ryo (supercell/EGOIST), Hiroyuki Sawano >> everyone else https://refi64.com/

[Tim]
[Ryan Gonzalez <rymg19@gmail.com>]
In the "different thread" I mentioned above, I already noted that kind of spelling. I'm not at a loss to think of many ways to spell it ;-) The point of this thread was intended to be about the semantics of binding expressions in comprehensions. For that purpose, the PEP noting that

    total = 0
    progressive_sums = [total := total + value for value in data]

fails too is equally relevant. Of course there are many possible ways to rewrite that too that would work. That doesn't change that the failing attempts "look like they should work", but don't, but could if the semantics of ":=" were defined differently inside magically-created anonymous lexically nested functions.
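For the record, the semantics of ":=" *were* defined differently in the accepted PEP 572: in Python 3.8+ a `:=` target in a comprehension binds in the containing scope, so this exact example now works, and `itertools.accumulate` gives the same result without binding expressions at all:

```python
import itertools

data = [1, 2, 3, 4]

total = 0
progressive_sums = [total := total + value for value in data]
print(progressive_sums)  # [1, 3, 6, 10]
print(total)             # 10 - the outer total really was rebound

print(list(itertools.accumulate(data)))  # [1, 3, 6, 10]
```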

On 7 May 2018 at 11:32, Tim Peters <tim.peters@gmail.com> wrote:
You have the reasoning there backwards: implicitly nested scopes behave like explicitly nested scopes because that was the *easy* way for me to implement them in Python 3.0 (since I got to re-use all the pre-existing compile time and runtime machinery that was built to handle explicit lexical scopes). Everything else I tried (including any suggestions made by others on the py3k mailing list when I discussed the problems I was encountering) ran into weird corner cases at either compile time or run time, so I eventually gave up and proposed that the implicit scope used to hide the iteration variable name binding be a full nested closure, and we'd just live with the consequences of that.

The sublocal scoping proposal in the earlier drafts of PEP 572 was our first serious attempt at defining a different way of doing things that would allow names to be hidden from surrounding code while still being visible in nested suites, and it broke people's brains to the point where Guido explicitly asked Chris to take it out of the PEP :)

However, something I *have* been wondering is whether or not it might make sense to allow inline scoping declarations in comprehension name bindings. Then your example could be written:

    def ...:
        p = None
        while any(n % p for nonlocal p in small_primes):
            # p was declared as nonlocal in the nested scope, so our p
            # points to the last bound value

Needing to switch from "nonlocal p" to "global p" at module level would likely be slightly annoying, but also a reminder that the bound name is now visible as a module attribute.

If any other form of comprehension level name binding does eventually get accepted, then inline scope declarations could similarly be used to hoist values out into the surrounding scope:

    rem = None
    while any((nonlocal rem := n % p) for nonlocal p in small_primes):
        # p and rem were declared as nonlocal in the nested scope, so our
        # rem and p point to the last bound value

Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On 7 May 2018 at 12:51, Nick Coghlan <ncoghlan@gmail.com> wrote:
Thinking about it a little further, I suspect the parser would reject "nonlocal name := ..." as creating a parsing ambiguity at statement level (where it would conflict with the regular nonlocal declaration statement). The extra keyword in the given clause would avoid that ambiguity problem:

    p = rem = None
    while any(rem for nonlocal p in small_primes given nonlocal rem = n % p):
        # p and rem were declared as nonlocal in the nested scope, so our
        # p and rem refer to their last bound values

Such a feature could also be used to make augmented assignments do something useful at comprehension scope:

    input_tally = 0
    process_inputs(x for x in input_iter given nonlocal input_tally += x)

Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
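No `given` clause exists in any Python release, but the tally side effect Nick describes can be had today with an explicit generator wrapper; this sketch uses a one-element list as a visible "cell" standing in for the proposed `nonlocal input_tally` (`process_inputs` here is a hypothetical stand-in consumer, not from the thread):

```python
def process_inputs(gen):
    # hypothetical consumer for illustration: just drains the generator
    return sum(1 for _ in gen)

def tallying(it, cell):
    # yields items unchanged while accumulating their sum into cell[0]
    for x in it:
        cell[0] += x
        yield x

input_tally = [0]
count = process_inputs(tallying(range(5), input_tally))
print(count, input_tally[0])  # 5 10
```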

[Tim]
[Nick Coghlan <ncoghlan@gmail.com>]
You have the reasoning there backwards:
That's easy to believe - I also had a long history of resisting nested scopes at all ;-)
It's unfortunate that there are "consequences" at all. That kind of thing is done all the time in Lisp-ish languages, but they require explicitly declaring names' scopes. Python's "infer scope instead by looking for bindings" worked great when it had 3 scopes total, but keeps forcing "consequences" that may or may not be desired in a generally-nested-scopes world.
To which removal I was sympathetic, BTW.
Which more directly addresses the underlying problem: not really "binding expressions" per se, but the lack of control over scope decisions in comprehensions period. It's not at all that nested scopes are a poor model, it's that we have no control over what's local _to_ nested scopes the language creates. I'd say that's worth addressing in its own right, regardless of PEP 572's fate. BTW, the "p = None" there is annoying too ;-)
Or `nonlocal` could be taught that its use one level below `global` has an obvious meaning: global.
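Today the compiler simply rejects `nonlocal` when no enclosing function binding exists, which is why teaching it this meaning would be a language change; a quick check shows the current behavior (standard Python 3):

```python
# A module-level function has no enclosing function scope, so `nonlocal`
# with no matching binding is a compile-time SyntaxError.
src = "def f():\n    nonlocal x\n"
try:
    compile(src, "<demo>", "exec")
except SyntaxError as e:
    print("SyntaxError:", e.msg)
```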
Right - as above, inline scope declarations would be applicable to all forms of comprehension-generated code. And to any other future construct that creates lexically nested functions.

On 2018-05-06 18:32, Tim Peters wrote:
I agree that is a limitation, and I see from a later message in the thread that Guido finds it compelling, but personally I don't find that particular case such a showstopper that it would tip the scales for me either way. If you have to write the workalike loop that iterates and returns the missing value, so be it. That's not a big deal. -- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown

[Tim]
[Brendan Barnwell <brenbarn@brenbarn.net>]
Guido didn't find it compelling: for that specific example to show `p` would require that for-loop targets "leak", and he remains opposed to that. I don't want that changed either. The issue instead is what the brand-new proposed ":=" should do, which isn't used in that example at all. Whether that specific example can be written in 500 other ways (of course it can) isn't really relevant.

One of the ironies already noted is that PEP 572 gives an example of something that _won't_ work ("progressive_sums") which happens to be the very use case that started the current debate about assignment expressions to begin with. That raises the very same issue about ":=" that "the obvious" rewrite of my example at the top raises. Which suggests to me (& apparently to Guido too) that there may be a real issue here worth addressing.

There are many use cases for binding expressions outside of synthetically generated functions. For PEP 572, it's the totality that will be judged, not just how they might work inside list comprehensions and generator expressions (the only topics in _this_ particular thread), let alone how they work in one specific example.
participants (12)
- Brendan Barnwell
- Chris Angelico
- Ethan Furman
- Guido van Rossum
- Jacco van Dorp
- Juancarlo Añez
- Matt Arcidy
- Nick Coghlan
- Ryan Gonzalez
- Steven D'Aprano
- Terry Reedy
- Tim Peters