On Wed, Jun 27, 2018 at 08:30:00AM +0100, Paul Moore wrote:
On 27 June 2018 at 07:54, Steven D'Aprano firstname.lastname@example.org wrote:
Comprehensions already run partly in the surrounding scope.
Given the code shown:
def test():
    a = 1
    b = 2
    result = [value for key, value in locals().items()]
    return result
But test() returns [1, 2]. So does that say (as you claim above) that "the comprehension ran in the enclosing scope"? Doesn't it just say that the outermost iterable runs in the enclosing scope?
I think I was careful enough to only say that this was the same result you would get *if* the comprehension ran in the outer scope. Not to specifically say it *did* run in the outer scope. (If I slipped up anywhere, sorry.)
I did say that the comprehension runs *partly* in the surrounding scope, and the example shows that the local namespace in the "... in iterable" part is not the same as the (sub)local namespace in the "expr for x in ..." part.
*Parts* of the comprehension run in the surrounding scope, and parts of it run in an implicit sublocal scope inside a hidden function, giving us a quite complicated semantics for "comprehension scope":
[expression for a in first_sequence for b in second ... ]
|------sublocal-----|----local-----|------sublocal------|
Try fitting *that* in the LEGB (+class) acronym :-)
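One way to see the three regions without leaning on locals() is to put comprehensions in a class body, where names from the enclosing (class) scope are not visible to the hidden sublocal scope. This is only a rough sketch: the helper probe() and the three function names are made up for illustration, and a class body is used purely because it makes the boundary observable.

```python
def probe(build):
    """Run build() and report whether the class body raised NameError."""
    try:
        build()
    except NameError:
        return "NameError"
    return "ok"


def first_iterable():
    class C:
        seq = (1, 2)
        out = [a for a in seq]                  # seq is read in the class scope


def expression_part():
    class C:
        k = 10
        out = [k for a in (1, 2)]               # k is read in the sublocal scope


def second_iterable():
    class C:
        seq = (1, 2)
        out = [b for a in (1,) for b in seq]    # seq is read in the sublocal scope


results = {
    "first iterable": probe(first_iterable),
    "expression": probe(expression_part),
    "second iterable": probe(second_iterable),
}
print(results)
# {'first iterable': 'ok', 'expression': 'NameError', 'second iterable': 'NameError'}
```

Only the first iterable can see the class-level name, which matches the sublocal / local / sublocal layout in the diagram.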
This becomes quite relevant once we include assignment expressions. To make the point that this is not specific to := but applies equally to Nick's "given" syntax as well, I'm going to use his syntax:
result = [a for a in (x given x = expensive_function(), x+1, 2*x, x**3)]
Here, the assignment to x runs in the local part. I can simulate that right now, using locals, but only outside of a function due to CPython's namespace optimization inside functions. (For simplicity, I'm going to replace the call to "expensive_function" with a constant.)
py> del x
py> [a for a in (locals().__setitem__('x', 2) or x, x+1, 2*x, x**3)]
[2, 3, 4, 8]
py> x
2
This confirms that the first sequence part of the comprehension runs in the surrounding local scope.
So far so good. What if we move that assignment one level deep? Unfortunately, I can no longer use locals for this simulation, due to a peculiarity of the CPython function implementation. But replacing the call to locals() with globals() does the trick:
del x
# simulate [b*a for b in (1,) for a in (x given x = 2, x+1, 2*x, x**3)]
[b*a for b in (1,) for a in (globals().__setitem__('x', 2) or x, x+1, 2*x, x**3)]
That also works. But the problem comes if the user tries to assign to x in both the local and a sublocal section:
# no simulation here, sorry
[b*a for b in (x given x = 2, x**2) for a in (x given x = x + 1, x**3)]
That looks like it should work. You're assigning to the same x in two parts of the same expression. Where's the problem?
But given the "implicit function" implementation of comprehensions, I expect that this ought to raise an UnboundLocalError. The local scope part is okay:
# needs a fixed-width font for best results
[b*a for b in (x given x = 2, x**2) for a in (x given x = x + 1, x**3)]
..............|-----local part----|.....|--------sublocal part--------|
but the sublocal part defines x as a sublocal variable, shadowing the surrounding local x, then tries to get a value for that sublocal x before it is defined.
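The expected failure can be approximated today by writing the hidden function out by hand. This is only a sketch under that assumption: implicit() is a hypothetical stand-in for whatever the compiler would generate, and the plain assignments stand in for the "given" clauses.

```python
def enclosing():
    x = 2                                # local part: (x given x = 2, x**2)

    def implicit(first_iterable):        # hypothetical stand-in for the hidden function
        result = []
        for b in first_iterable:
            # sublocal part: assigning x here makes x local to implicit(),
            # so the right-hand side reads it before it has a value
            x = x + 1
            for a in (x, x**3):
                result.append(b * a)
        return result

    return implicit((x, x**2))


try:
    enclosing()
except UnboundLocalError as err:
    outcome = type(err).__name__
else:
    outcome = "no error"

print(outcome)   # UnboundLocalError
```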
If we had assignment expressions before generator expressions and comprehensions, I don't think this would have been the behaviour we desired.
(We might, I guess, tolerate it as a cost of the implicit function implementation. But we surely wouldn't argue for this complicated scoping behaviour as a good thing in and of itself.)
In any case, we can work around this (at some cost to clarity and obviousness) by changing the name of the variable. Not a big burden when the variable is a single character x:
[b*a for b in (x given x = 2, x**2) for a in (y given y = x + 1, y**3)]
but if x is a more descriptive name, that becomes more annoying. Never mind, it is a way around this.
Or we could Just Make It Work by treating the entire comprehension as the same scope for assignment expressions. (I stress, not for the loop variable.) Instead of having to remember which bits of the comprehension run in which scope, we have a conceptually much simpler rule:
- comprehensions are expressions, and assignments inside them bind to the enclosing local scope, just like other expressions;
- except for the loop variables, which are intentionally encapsulated inside the comprehension and don't "leak".
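A minimal sketch of that two-part rule, assuming a Python where the := operator is available (3.8 or later) and where it behaves this way inside comprehensions. The function demo() and its variable names are made up for illustration.

```python
def demo():
    total = 0
    # The assignment expression rebinds demo()'s own `total` ...
    running = [total := total + n for n in (1, 2, 3)]
    # ... while the loop variable `n` stays inside the comprehension.
    try:
        n
    except NameError:
        leaked = False
    else:
        leaked = True
    return running, total, leaked


print(demo())   # ([1, 3, 6], 6, False)
```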
The *implementation details* of how that works are not conceptually relevant. We may or may not want to advertise the fact that comprehensions use an implicit hidden function to do the encapsulation, and implicit hidden nonlocal to undo the effects of that hidden function. Or whatever implementation we happen to use.
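For anyone who does want to picture one possible implementation, here is a hand-written sketch of "hidden function plus hidden nonlocal". It is an assumption about how such a thing could be spelled, not a description of what any interpreter actually does. The names enclosing_with_nonlocal() and implicit() are hypothetical.

```python
def enclosing_with_nonlocal():
    x = 2                                # local part: (x given x = 2, x**2)

    def implicit(first_iterable):        # hypothetical hidden function
        nonlocal x                       # hypothetical hidden declaration
        result = []
        for b in first_iterable:
            x = x + 1                    # rebinds the enclosing x, no error
            for a in (x, x**3):
                result.append(b * a)
        return result

    out = implicit((x, x**2))
    return out, x


print(enclosing_with_nonlocal())   # ([6, 54, 16, 256], 4)
```

The loop variables a and b remain private to implicit(), while x is shared with the enclosing function, which is the split the two bullet points describe.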
So everybody expected the actual behaviour?
More or less, if we ignore a few misapprehensions about how locals works.
On the other hand,
>>> def test():
...     a = 1
...     b = 2
...     result = [locals().items() for v in 'a']
...     return result
...
>>> test()
[dict_items([('v', 'a'), ('.0', <str_iterator object at 0x0000015AA0BDE8D0>)])]
and I bet no-one would have expected that if you'd posed that question
I suspect not. To be honest, I didn't even think of asking that question until after I had asked the first.
The problem is that := allows you to *change* values in a scope, and at that point you need to know *which* scope. So to that extent, the locals() question is important. However, I still suspect that most people would answer that they would like := to assign values *as if* they were in the enclosing scope,
That is my belief as well. But that was intentionally not the question I was asking. I was interested in seeing whether people thought of comprehensions as a separate scope, or part of the enclosing scope.