[Python-Dev] Informal educator feedback on PEP 572 (was Re: 2018 Python Language Summit coverage, last part)
Steven D'Aprano
steve at pearwood.info
Sat Jun 30 04:17:02 EDT 2018
On Wed, Jun 27, 2018 at 09:52:43PM -0700, Chris Barker wrote:
> It seems everyone agrees that scoping rules should be the same for
> generator expressions and comprehensions,
Yes. I dislike saying "comprehensions and generator expressions" over
and over again, so I just say "comprehensions".
Principle One:
- we consider generator expressions to be a lazy comprehension;
- or perhaps comprehensions are eager generator expressions;
- either way, they behave the same in regard to scoping rules.
Principle Two:
- the scope of the loop variable stays hidden inside the
sub-local ("comprehension" or "implicit hidden function")
scope;
- i.e. it does not "leak", even if you want it to.
Principle Three:
- glossing over the builtin name look-up, calling list(genexpr)
will remain equivalent to using a list comprehension;
- similarly for set and dict comprehensions.
Principle Four:
- comprehensions (and genexprs) already behave "funny" inside
class scope; any proposal to fix class scope is beyond the,
er, scope of this PEP and can wait for another day.
So far, there should be (I hope!) no disagreement with those first four
principles. With those four principles in place, teaching and using
comprehensions (genexprs) in the absence of assignment expressions does
not need to change one iota.
Normal cases stay normal; weird cases mucking about with locals() inside
the comprehension are already weird and won't change.
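A small illustration of Principles Two and Three, offered only as a sketch: the function name demo and the values are made up for the example.

```python
def demo():
    i = "outer"                            # ordinary local variable
    eager = [i * i for i in range(4)]      # list comprehension
    lazy = list(i * i for i in range(4))   # list() over a generator expression
    # Neither loop variable rebinds the enclosing i, and both forms
    # produce the same list.
    return i, eager, lazy

print(demo())
```

The enclosing i is still "outer" afterwards, and the two lists compare equal.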
> So what about:
>
> l = [x:=i for i in range(3)]
>
> vs
>
> g = (x:=i for i in range(3))
>
> Is there any way to keep these consistent if the "x" is in the regular
> local scope?
Yes. That is what closures already do.
We already have such nonlocal effects in Python 3. Move the loop inside
an inner (nested) function, and then either call it immediately to
simulate the effect of a list comprehension, or delay calling it to
behave more like a generator expression.
Of course the *runtime* effects depend on whether or not the generator
expression is actually evaluated. But that's no mystery, and is
precisely analogous to this case:
    def demo():
        x = 1
        def inner():
            nonlocal x
            x = 99
        inner()  # call the inner function
        print(x)
This prints 99. But if you comment out the call to the inner function,
it prints 1. I trust that doesn't come as a surprise.
Nor should this come as a surprise:
    def demo():
        x = 1
        # assuming assignment scope is local rather than sublocal
        g = (x := i for i in (99,))
        L = list(g)
        print(x)
The value of x printed will depend on whether or not you comment out
the call to list(g).
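The same effect can be reproduced without any new syntax. In the sketch below a hand-written generator function with an explicit nonlocal stands in for the hypothetical generator expression; the names gen and consume are invented for the illustration.

```python
def demo(consume):
    x = 1

    def gen():
        # Stand-in for (x := i for i in (99,)) under local-scope semantics.
        nonlocal x
        for i in (99,):
            x = i
            yield i

    g = gen()        # creating the generator runs none of its body
    if consume:
        list(g)      # only iterating it performs the assignment
    return x

print(demo(True), demo(False))
```

With consume true the outer x is rebound to 99; with consume false the generator body never runs and x stays 1.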
> Note that this thread is titled "Informal educator feedback on PEP 572".
>
> As an educator -- this is looking harder and harder to explain to newbies...
>
> Though easier if any assignments made in a "comprehension" don't "leak out".
Let me introduce two more principles.
Principle Five:
- all expressions are executed in the local scope.
Principle Six:
- the scope of an assignment expression variable inside a
comprehension (genexpr) should not depend on where inside
the comprehension it sits.
Five is, I think, so intuitive that we forget about it in the same way
that we forget about the air we breathe. It would be surprising, even
shocking, if two expressions in the same context were executed in
different scopes:
    result = [x + 1, x - 2]
If the first x were local and the second was global, that would be
disturbing. The same rule ought to apply if we include assignment
expressions:
    result = [(x := expr) + 1, x := x - 2]
It would be disturbing if the first assignment (x := expr) executed in
the local scope, and the second (x := x - 2) failed with NameError
because it was executed in the global scope.
Or worse, *didn't* fail with NameError, but instead returned something
totally unexpected.
Now bring in a comprehension:
    result = [(x := expr) + 1] + [x := x - 2 for a in (None,)]
Do you still want the x inside the comprehension to be a different x to
the one outside the comprehension? How are you going to explain that
UnboundLocalError to your students?
That's not actually a rhetorical question. I recognise that while
Principle Five seems self-evidently desirable to me, you might consider
it less important than the idea that "assignments inside comprehensions
shouldn't leak".
I believe that these two expressions should give the same results, right
down to the side-effects:
    [(x := expr) + 1, x := x - 2]
    [(x := expr) + 1] + [x := x - 2 for a in (None,)]
I think that is the simplest and most intuitive behaviour, the one
which will be the least surprising, cause the fewest unexpected
NameErrors, and be the simplest to explain.
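For readers who want to try this equivalence: the sketch below assumes an interpreter that implements assignment expressions with the target bound in the containing scope, which is what Python 3.8 and later do. The function names and the argument 10 are made up for the example.

```python
def with_display(expr):
    result = [(x := expr) + 1, x := x - 2]
    return result, x

def with_comprehension(expr):
    result = [(x := expr) + 1] + [x := x - 2 for a in (None,)]
    return result, x

print(with_display(10))
print(with_comprehension(10))
```

On such an interpreter both functions return the same list and leave x bound to the same value in the function's own scope.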
If you still prefer the "assignments shouldn't leak" idea, consider
this: under the current implementation of comprehensions as an implicit
hidden function, the scope of a variable depends on *where* it is,
violating Principle Six.
(That was the point of my introducing locals() into a previous post: to
demonstrate that, today, right now, "comprehension scope" is a misnomer.
Comprehensions actually execute in a hybrid of at least two scopes, the
surrounding local scope and the sublocal hidden implicit function
scope.)
Let me bring in another equivalency:
    [(x := expr) + 1, x := x - 2]
    [(x := expr) + 1] + [x := x - 2 for a in (None,)]
    [(x := expr) + 1] + [a for a in (x := x - 2,)]
By Principle Six, the side-effect of assigning to x shouldn't depend on
where inside the comprehension it is. The two comprehension expressions
shown ought to be referring to the same "x" variable (in the same scope)
regardless of whether that is the surrounding local scope, or a sublocal
comprehension scope.
(In the case of it being a sublocal scope, the two comprehensions will
raise UnboundLocalError.)
But -- and this is why I raised all that hoo-ha about locals() --
according to the current implementation, they *don't*. This version
would assign to x in the sublocal scope:
    # best viewed in a monospaced font
    [x := x - 2 for a in (None,)]
     ^^^^^^^^^^ this is sublocal scope
but this would assign in the surrounding local scope:
    [a for a in (x := x - 2,)]
                ^^^^^^^^^^^^^ this is local scope
I strongly believe that all three ought to be equivalent, including
side-effects. (Remember that by Principle Two, we agree that the loop
variable doesn't leak. The loop variable is invisible from the outside
and doesn't count as a side-effect for this discussion.)
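The split between the two scopes can be imitated by hand. The sketch below is a rough desugaring of a list comprehension into a hidden function that receives the already-evaluated outermost iterable as its argument; the name _hidden and the values are invented, and this is an approximation of the behaviour described above rather than the compiler's actual output.

```python
def demo():
    x = 10

    def _hidden(outermost):
        # The element expression and loop run in here, in a separate scope.
        result = []
        for a in outermost:
            x = a * 2          # binds a different x, local to _hidden
            result.append(x)
        return result

    # The outermost iterable is evaluated out here, in demo's own scope,
    # so it sees demo's x (and an assignment here would rebind it).
    values = _hidden(iter((x - 2,)))
    return x, values

print(demo())
```

The binding made inside _hidden never reaches the outer x, while the iterable expression reads the outer x directly: two scopes for one comprehension.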
So here are three possibilities (assuming assignment expressions are
permitted):
1. Nick doesn't like the idea of having to inject an implicit
"nonlocal" into the comprehension hidden implicit function;
if we don't, that gives us the case where the scope of
assignment variables depends on where they are in the
comprehension, and will sometimes leak and sometimes not.
This torpedoes Principle Six, and leaves you having to explain why
assignment sometimes "works" inside comprehensions and sometimes gives
UnboundLocalError.
2. If we decide that assignment inside a comprehension should always
be sublocal, the implementation becomes more complex in order to
bury the otherwise-local scope beneath another layer of even more
hidden implicit functions.
That rules out some interesting (but probably not critical) uses of
assignment expressions inside comprehensions, such as using them as a
side-channel to sneak out debugging information.
And it adds a great big screaming special case to Principle Five:
- all expressions, EXCEPT FOR THE INSIDE OF COMPREHENSIONS, are
executed in the local scope.
3. Or we just make all assignments inside comprehensions (including gen
exprs) occur in the surrounding local scope.
Number 3 is my strong preference. It complicates the implementation a
bit (needing to add some implicit nonlocals) but not as much as needing
to hide the otherwise-local scope beneath another implicit function. And
it gives by far the most consistent, useful and obvious semantics out of
the three options.
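The difference between options 1 and 3 can be sketched with hand-written hidden functions. The names option_one, option_three and _hidden are hypothetical, and the desugaring is only an approximation used to show the scoping behaviour, not a statement of how any implementation is written.

```python
def option_one():
    x = 10

    def _hidden(outermost):
        # No implicit nonlocal: x is local to _hidden and unbound on first read.
        result = []
        for a in outermost:
            x = x - 2
            result.append(x)
        return result

    try:
        return _hidden(iter((None,)))
    except UnboundLocalError as exc:
        return type(exc).__name__


def option_three():
    x = 10

    def _hidden(outermost):
        nonlocal x             # the implicit declaration option 3 would add
        result = []
        for a in outermost:
            x = x - 2          # rebinds option_three's x
            result.append(x)
        return result

    values = _hidden(iter((None,)))
    return x, values


print(option_one())
print(option_three())
```

Without the nonlocal the assignment fails with UnboundLocalError; with it, the surrounding x is updated and remains visible after the comprehension finishes.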
My not-very-extensive survey on the Python-List mailing lists suggests
that, if you don't ask people explicitly about "assignment expressions",
they already think of the inside of comprehensions as being part of the
surrounding local scope rather than a hidden inner function. So I do not
believe that this will be hard to teach.
These two expressions ought to give the same result with the same
side-effect:
    [x := 1]
    [x := a for a in (1,)]
That, I strongly believe, is the intuitive behaviour to people who aren't
immersed in the implementation details of comprehensions, as well as
being the most useful.
--
Steve