[Tim]
... ":=" target names in a genexp/listcomp are treated exactly the same as any other non-for-target name: they resolve to the same scope as they resolve to in the block that contains them. The only twist is that if such a name `x` isn't otherwise known in the block, then `x` is established as being local to the block (which incidentally also covers the case when the genexp/listcomp is at module level, where "local to the block" and "global to the block" mean the same thing). Class scope may be an exception (I cheerfully never learned anything about how class scope works, because I don't write insane code ;-) ).
[Nick]
That's all well and good, but it is *completely insufficient for the language specification*.
I haven't been trying to write reference docs here, but so far as supplying a rigorous specification goes, I maintain the above gets "pretty close". It needs more words, and certainly isn't in the _style_ of Python's current reference docs, but that's all repairable. Don't dismiss it just because it's brief. Comprehensions already exist in the language, and so do nested scopes, so it's not necessary for this PEP to repeat any of the stuff that goes into those. Mostly it needs to specify the scopes of assignment expression target names - and the _intent_ here is really quite simple.

Here with more words, restricted to the case of assignment expressions in comprehensions (the only case with any subtleties):

Consider a name `y` appearing in the top level of a comprehension as an assignment expression target, where the comprehension is immediately contained in scope C, and the names belonging to scopes containing C have already been determined:

    ... (y := expression) ...

We can ignore that `y` also appears as a `for` target at the comprehension's top level, because it was already decided that's a compile-time error.

Consider what the scope of `y` would be if `(y := expression)` were textually replaced by `(y)`. Then what would the scope of `y` be? The answer relies solely on what the docs _already_ specify. There are three possible answers:

1. The docs say `y` belongs to scope S (which may be C itself, or a scope containing C). Then y's scope in the original comprehension is S.

2. The docs say name `y` is unknown. Then y's scope in the original comprehension is C.

3. The docs are unclear about whether #1 or #2 applies. Then the language is _already_ ill-defined.

It doesn't matter to this whether the assignment expression is, or is not, in the expression that defines the iterable for the outermost `for`.

What about that is hand-wavy?
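A small runnable sketch of rule #2 and of the compile-time error mentioned above. It assumes a Python version (3.8 or later) in which assignment expressions exist and behave as described here for these two cases; the function names are invented for illustration, and the sketch makes no claim about the case where S is a scope outside C.

```python
def rule_two_demo():
    # `y` appears nowhere else in this function, so under rule #2 the
    # target is local to the containing function, not the comprehension.
    doubled = [y := n * 2 for n in range(3)]
    return doubled, y  # `y` is still visible after the comprehension


def rebinding_for_target_is_rejected():
    # Using the same name as both a `for` target and an assignment
    # expression target is expected to fail at compile time.
    source = "[y := 1 for y in range(3)]"
    try:
        compile(source, "<demo>", "eval")
    except SyntaxError:
        return True
    return False
```

Calling `rule_two_demo()` shows the last value bound to `y` surviving in the containing function, which is the whole point of rule #2.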
Defining semantics clearly and unambiguously doesn't require specifying a concrete implementation (the latter is one possible way to achieve the goal - but _here_ it's a convoluted PITA because Python has no way to explicitly declare intended scopes).

Since all questions about scope are reduced by the above to questions about Python's _current_ scope rules, it's as clear and unambiguous as Python's current scope rules. Now those may not be the _intended_ rules in all cases. That deserves deep scrutiny. But claiming it's too vague to scrutinize doesn't fly with me. If there's a scope question you suspect can't be answered by the above, or that the above gives an unintended answer to, by all means bring that up! If your question isn't about scope, then I'd probably view it as being irrelevant to the current PEP (e.g., what `locals()` returns depends on how the relevant code object attributes are set, which are in turn determined by which scopes names belong to relative to the code block's local scope, and it's certainly not _this_ PEP's job to redefine what `locals()` does with that info).

Something to note: for-target names appearing in the outermost `for` _may_ have different scopes in different parts of the comprehension.

    y = 12
    [y for y in range(y)]

There the first two `y`'s have scope local to the comprehension, but the last `y` is local to the containing block. But an assignment expression target name always has the same scope within a comprehension. In that specific sense, their scope rules are "more elegant" than for-target names. This isn't a new rule, but a logical consequence of the scope-determining algorithm given above. It's a _conceptual_ consequence of the fact that assignment expression targets are "intended to act like" the bindings are performed _in_ scope C rather than in the comprehension's scope.
And that's no conceptually weirder than that it's _already_ the case that the expression defining the iterable of the outermost `for` _is_ evaluated in scope C (which I'm not a fan of, but which is rhetorically convenient to mention here ;-) ). As I've said more than once already, I don't know whether this should apply to comprehensions at class scope too - I've never used a comprehension in class scope, and doubt I ever will. Without use cases I'm familiar with, I have no idea what might be most useful there. Best uninformed guess is that the above makes decent sense at class scope too, especially given that I've picked up on that people are already baffled by some comprehension behavior at class scope. I suspect that you already know, but find it rhetorically convenient to pretend this is all so incredibly unclear you can't possibly guess ;-)
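The `y = 12` example from this reply can be run as is. The sketch below wraps it in a function (the name `outermost_iterable_demo` is made up) to show that `range(y)` sees the containing block's `y`, while the iteration variable `y` does not leak back out.

```python
def outermost_iterable_demo():
    y = 12
    # range(y) is evaluated in this function's scope, so it sees y == 12.
    # The other two `y`s are the comprehension's own iteration variable.
    values = [y for y in range(y)]
    # The function's `y` is untouched by the loop inside the comprehension.
    return values, y
```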
For the language spec, we have to be able to tell implementation authors exactly how all of the "bizarre edge case"
Which are?
that you're attempting to hand wave away
Not attempting to wave them away - don't know what you're referring to. The proposed scope rules are defined entirely by straightforward reference to existing scope rules - and stripped of all the excess verbiage amount to no more than "same scope in the comprehension as in the containing scope".
should behave by updating https://docs.python.org/dev/reference/expressions.html#displays-for-lists-se...
Thanks for the link! I hadn't seen that before. If the PEP gets that far, I'd think harder about how it really "ought to be" documented. I think, e.g., that scope issues should be more rigorously handled in section 4.2 (which is about binding and name resolution).
appropriately. It isn't 1995 any more - while CPython is still the reference implementation for Python, we're far from being the only implementation, which means we have to be a lot more disciplined about how much we leave up to the implementation to define.
What in the "more words" above was left to the implementation's discretion? I can already guess you don't _like_ the way it's worded, but that's not what I'm asking about.
The expected semantics for locals() are already sufficiently unclear that they're a source of software bugs (even in CPython) when attempting to run things under a debugger or line profiler (or anything else that sets a trace function). See https://www.python.org/dev/peps/pep-0558/ for details.
As above, what does that have to do with PEP 572? The docs you referenced as a model don't even mention `locals()` - but PEP 572 must?

Well, fine: from the explanation above, it's trivially deduced that all names appearing as assignment expression targets in comprehensions will appear as free variables in their code blocks, except for when they resolve to the global scope. In the former case, looks like `locals()` will return them, despite that they're _not_ local to the block. But that's the same thing `locals()` does for free variables created via any means whatsoever - it appears to add all the names in code_object.co_freevars to the returned dict. I have no idea why it acts that way, and wouldn't have done it that way myself. But if that's "a bug", it would be repaired for the PEP 572 cases at the same time and in the same way as for all other freevars cases.

Again, the only thing at issue here is specifying intended scopes. There's nothing inherently unique about that.
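The general claim about `locals()` and `co_freevars` can be checked with an ordinary closure. This is only a sketch: it deliberately avoids comprehensions, since how they are compiled has varied between CPython versions, and the names `outer`, `inner` and `captured` are invented for illustration.

```python
def outer():
    captured = "from outer"

    def inner():
        _ = captured  # referencing it makes `captured` a free variable
        return locals()

    return inner, inner()


fn, seen = outer()
# Names the inner code object treats as free variables.
freevars = fn.__code__.co_freevars
# Whether locals() reported the free variable even though it is not
# local to inner().
reported = "captured" in seen
```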
"Comprehension scopes are already confusing, so it's OK to dial their weirdness all the way up to 11" is an *incredibly* strange argument to be attempting
That's an extreme characterization of what, in reality, is merely specifying scopes. That

    total = 0
    sums = [total := total + value for value in data]

blows up without the change is at least as confusing - and is more confusing to me.
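To see why the running-total example would blow up if the target were local to the comprehension, here is a hand-written emulation (an assumption about what "without the change" would mean, modelled on a nested function with no `nonlocal` declaration), next to the assignment-expression form as it behaves in Python 3.8 and later. All function names are hypothetical.

```python
def emulated_without_change(data):
    total = 0  # never reached by the inner function below

    def _listcomp(_outermost_iter):
        result = []
        for value in _outermost_iter:
            # `total` is assigned here, so it is local to _listcomp and
            # is read before it has been bound.
            total = total + value
            result.append(total)
        return result

    return _listcomp(iter(data))


def with_assignment_expression(data):
    total = 0
    sums = [total := total + value for value in data]
    return sums, total


try:
    emulated_without_change([1, 2, 3])
    blew_up = False
except UnboundLocalError:
    blew_up = True
```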
to make when the original better defined sublocal scoping proposal was knocked back as being overly confusing (even after it had been deliberately simplified by prohibiting nonlocal access to sublocals).
I'm done arguing about this part ;-)
Right now, the learning process for picking up the details of comprehension scopes goes something like this:
Who needs to do this? I'm not denying that many people do, but is that a significant percentage of those who merely want to _use_ comprehensions? We already did lots of heroic stuff apparently attempting to cater to those who _don't_ want to learn about their implementation, like evaluating the outer iterable "at once" outside the comprehension scope, and - indeed - bothering to create a new scope for them at all. Look at the "total := total + value" example again and really try to pretend you don't know anything about the implementation. "It works!" is a happy experience :-) For the rest of this message, it's an entertaining and educational development. I'm not clear on what it has to do with the PEP, though.
* make the technically-incorrect-but-mostly-reliable-in-the-absence-of-name-shadowing assumption that "[x for x in data]" is semantically equivalent to a for loop (especially common for experienced Py2 devs where this really was the case!):
    _result = []
    for x in data:
        _result.append(x)
* discover that "[x for x in data]" is actually semantically equivalent to "list(x for x in data)" (albeit without the name lookup and optimised to avoid actually creating the generator-iterator)
* make the still-technically-incorrect-but-even-more-reliable assumption that the generator expression "(x for x in data)" is equivalent to

    def _genexp():
        for x in data:
            yield x

    _result = _genexp()
* *maybe* discover that even the above expansion isn't quite accurate, and that the underlying semantic equivalent is actually this (one way to discover this by accident is to have a name error in the outermost iterable expression):
    def _genexp(_outermost_iter):
        for x in _outermost_iter:
            yield x

    _result = _genexp(data)
* and then realise that the optimised list comprehension form is essentially this:
    def _listcomp(_outermost_iter):
        result = []
        for x in _outermost_iter:
            result.append(x)
        return result

    _result = _listcomp(data)
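The "name error in the outermost iterable" route to discovering the last two expansions can be reproduced with a short sketch. The names `undefined_outer` and `undefined_inner` are deliberately left undefined, and the helper functions are invented for illustration.

```python
def make_eager():
    # The outermost iterable is evaluated when the generator expression
    # is created, so the NameError is raised right here.
    return (x for x in undefined_outer)


def make_lazy():
    # An inner iterable is only evaluated once iteration starts, so
    # creating the generator succeeds.
    return (x for x in [1] for z in undefined_inner)


def raises_name_error(thunk):
    try:
        thunk()
    except NameError:
        return True
    return False


eager_fails_on_creation = raises_name_error(make_eager)
lazy_fails_on_creation = raises_name_error(make_lazy)
lazy_gen = make_lazy()
lazy_fails_on_first_next = raises_name_error(lambda: next(lazy_gen))
```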
Now that "yield" in comprehensions has been prohibited, you've learned all the edge cases at that point - all of the runtime behaviour of things like name references, locals(), lambda expressions that close over the iteration variable, etc can be explained directly in terms of the equivalent functions and generators, so while comprehension iteration variable hiding may *seem* magical, it's really mostly explained by the deliberate semantic equivalence between the comprehension form and the constructor+genexp form. (That's exactly how PEP 3100 describes the change: "Have list comprehensions be syntactic sugar for passing an equivalent generator expression to list(); as a consequence the loop variable will no longer be exposed")
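As one illustration of explaining runtime behaviour through the equivalent function, the sketch below compares lambdas that close over a comprehension's iteration variable with the same lambdas built inside a hand-written `_listcomp` function. The wrapper names are hypothetical; both versions are expected to give the same result because every lambda shares a single variable `x`.

```python
def via_comprehension():
    callbacks = [lambda: x for x in range(3)]
    return [f() for f in callbacks]


def via_function():
    def _listcomp(_outermost_iter):
        result = []
        for x in _outermost_iter:
            result.append(lambda: x)  # every lambda sees the same `x`
        return result

    callbacks = _listcomp(iter(range(3)))
    return [f() for f in callbacks]
```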
As such, any proposal to have name bindings behave differently in comprehension and generator expression scope from the way they would behave in the equivalent nested function definitions *must be specified to an equivalent level of detail as the status quo*.
I don't see any of those Python workalike examples in the docs. So which "status quo" are you referring to? You already know it's possible, and indeed straightforward, to write functions that model the proposed scope rules in any given case, so what's your real point? They're "just like" the stuff above, possibly adding a sprinkling of "nonlocal" and/or "global" declarations. They don't require changing anything fundamental about the workalike examples you've already given - just adding cruft to specify scopes.

I don't want to bother doing it here, because it's just tedious, and you _already know_ it. Most tediously, because there's no explicit way to declare a non-global scope in Python, in the

"""
2. The docs say name `y` is unknown. Then y's scope in the original comprehension is C.
"""

case it's necessary to do something like:

    if 0:
        y = None

in the scope containing the synthetic function so that the contained "nonlocal y" declaration knows which scope `y` is intended to live in. (The "if 0:" block is optimized out of existence, but after the compiler has noticed the local assignment to `y` and so records that `y` is containing-scope-local.) Crap like that isn't really illuminating.
All of the attempts at such a definition that have been made so far have been riddled with action at a distance and context-dependent compilation requirements:
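For readers who want to see the tedious version spelled out, here is one possible workalike for the rule #2 case, using the `if 0:` trick and a `nonlocal` declaration. It is a sketch under the assumptions stated in the reply, not a specification; `containing` and `_listcomp` are illustrative names.

```python
def containing():
    if 0:
        y = None  # never executed, but marks `y` as local to containing()

    def _listcomp(_outermost_iter):
        nonlocal y  # resolves to containing()'s `y` thanks to the block above
        result = []
        for n in _outermost_iter:
            y = n * 2  # stands in for (y := n * 2)
            result.append(y)
        return result

    values = _listcomp(iter(range(3)))
    return values, y
```

Removing the `if 0:` block makes the `nonlocal y` declaration a compile-time error, which is the inconvenience being described.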
* whether to implicitly declare the binding target as nonlocal or global depends on whether or not you're at module scope or inside a function
That's artificial silliness, though. Already suggested that Python repair one of its historical scope distinctions by teaching `nonlocal` that

    nonlocal x

in a top-level function is a synonym for

    global x

in a top-level function. In every relevant conceptual sense, the module scope _is_ the top-level lexical scope. It seems pointlessly pedantic to me to insist that `nonlocal` _only_ refer to a non-global enclosing lexical scope. Who cares? The user-level semantically important part is "containing scope", not "is implemented by a cell object".

In the meantime, BFD. So long as the language keyword insists on making that distinction, ya, it's a distinction that needs to be made by users too (and by the compiler regardless).

This isn't some inherently new burden for the compiler either. When it sees a not-local name in a function, it already has to figure out whether to reference a cell or pump out a LOAD_GLOBAL opcode.
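The distinction the keyword currently insists on is easy to observe with `compile()`. In this sketch the two source strings and the helper `compiles` are invented for illustration; the only behaviour relied on is that `nonlocal` naming a module-level variable from a top-level function is rejected at compile time while `global` is accepted.

```python
def compiles(source):
    try:
        compile(source, "<demo>", "exec")
    except SyntaxError:
        return False
    return True


# Same top-level function, differing only in the declaration keyword.
NONLOCAL_AT_TOP = "x = 0\ndef bump():\n    nonlocal x\n    x += 1\n"
GLOBAL_AT_TOP = "x = 0\ndef bump():\n    global x\n    x += 1\n"

nonlocal_accepted = compiles(NONLOCAL_AT_TOP)
global_accepted = compiles(GLOBAL_AT_TOP)
```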
* the desired semantics at class scope have been left largely unclear
Covered before. Someone who knows something about _desired_ class scope behavior needs to look at that. That's not me.
* the desired semantics in the case of nested comprehensions and generator expressions has been left entirely unclear
See the "more words" version above. It implies that scopes need to be resolved "outside in" for nesting of any kind. Which they need to be anyway, e.g., to make the "is this not-local name a cell or a global?" distinction in any kind of function code.
Now, there *are* ways to resolve these problems in a coherent way, and that would be to define "parent local scoping" as a new scope type, and introduce a corresponding "parentlocal NAME" compiler declaration to explicitly request those semantics for bound names (allowing the expansions of comprehensions and generator expressions as explicitly nested functions to be adjusted accordingly).
Sorry, I don't know what that means. I don't even know what "compiler declaration" alone means. Regardless, there's nothing here that can't be explained easily enough by utterly vanilla lexically nested scopes. All the apparent difficulties stem from the inability to explicitly declare a name's intended scope, and that the "nonlocal" keyword in a top-level function currently refuses to acknowledge that the global scope _is_ the containing not-local scope.

If you mean adding a new statement to Python

    parentlocal NAME

... sure, that could work. But it obscures that the problem just isn't hard enough to require such excessive novelty in Python's scope gimmicks. The correct place to declare NAME's scope is _in_ NAME's intended scope, the same as in every other language with lexical scoping.

There's also that the plain English meaning of "parent local" only applies to rule #2 at the top, and to the proper subset of cases in rule #1 where it turns out that S is C. In the other rule #1 cases, "parentlocal" would be a misleading name for the less specific "nonlocal" or the more specific "global".

Writing workalike functions by hand isn't difficult regardless, just tedious (even without the current proposal!), and I don't view it as a significant use case regardless. I expect the minority who do it have real fun with it for a day or two, and then quite possibly never again. Which is a fair summary of my own life ;-)
But the PEP will need to state explicitly that that's what it is doing, and fully specify how those new semantics are expected to work in *all* of the existing scope types, not just the two where the desired behaviour is relatively easy to define in terms of nonlocal and global.
So you finally admit they _are_ relatively easy to define ;-)

What, specifically, _are_ "*all* of the existing scope types"? There are only module, class, and function scopes in my view of the world (and "comprehension scope" is just a name given at obvious times to function scope in my view of the world). If you also want piles of words about, e.g., how PEP 572 acts in all cases in smaller blocks, like code typed at a shell, or strings passed to eval() or exec(), you'll first have to explain why this was never necessary for any previous feature.

PS: I hope you appreciate that I didn't whine about microscopic differences in the workalike examples' generated byte code ;-)