[Python-ideas] A comprehension scope issue in PEP 572

Thu May 10 12:38:52 EDT 2018

On 10 May 2018 at 23:22, Guido van Rossum <guido at python.org> wrote:

> On Thu, May 10, 2018 at 5:17 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:
>
>> How would you expect this to work in cases where the generator expression
>> isn't immediately consumed? If "p" is nonlocal (or global) by default, then
>> that opens up the opportunity for it to be rebound between generator steps.
>> That gets especially confusing if you have multiple generator expressions
>> in the same scope iterating in parallel using the same binding target:
>>
>>     # This is fine
>>     gen1 = (p for p in range(10))
>>     gen2 = (p for p in gen1)
>>     print(list(gen2))
>>
>>     # This is not (given the "let's reintroduce leaking from
>>     # comprehensions" proposal)
>>     p = 0
>>     gen1 = ((p := q) for q in range(10))
>>     gen2 = ((p, (p := q)) for q in gen1)
>>     print(list(gen2))
>>
>
> That's just one of several "don't do that" situations. *What will happen*
> is perhaps hard to see at a glance, but it's perfectly well specified. Not
> all legal code does something useful though, and in this case the obvious
> advice should be to use different variables.
>

I can use that *exact same argument* to justify the Python 2 comprehension
variable leaking behaviour. We decided that was a bad idea based on ~18
years of experience with it, and no clear justification has been presented
for going back on that decision beyond "Tim would like using it sometimes".

PEP 572 was on a nice trajectory towards semantic simplification (removing
sublocal scoping, restricting to name targets only, prohibiting name
binding expressions in the outermost iterable of comprehensions to avoid
exposing the existing scoping quirks any more than they already are), and
then we suddenly had this bizarre turn into "and they're going to be
implicitly nonlocal or global when used in comprehension scope".
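
To make the generator case quoted above concrete, here is a minimal sketch
of how two lazily consumed generator expressions interleave when they share
one binding target. It assumes an interpreter that implements the proposed
binding rule (CPython 3.8 and later does). The function name
shared_target_demo is hypothetical, and the code is wrapped in a function
so that "p" is an ordinary function local:

```python
def shared_target_demo():
    p = 0
    # Both generators rebind the same "p" in the enclosing function scope.
    gen1 = ((p := q) for q in range(3))
    # A reader might expect (previous, current) pairs from this one.
    gen2 = ((p, (p := q)) for q in gen1)
    pairs = list(gen2)
    return pairs, p


pairs, final_p = shared_target_demo()
print(pairs)
```

Each step of gen2 first resumes gen1, which has already rebound "p" to the
new value by the time the tuple is built, so the pairs come out as (0, 0),
(1, 1), (2, 2) rather than (0, 0), (0, 1), (1, 2).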

>> It also reintroduces the original problem that comprehension scopes
>> solved, just in a slightly different form:
>>
>>     # This is fine
>>     for x in range(10):
>>         for y in range(10):
>>             transposed_related_coords = [(y, x) for x, y in
>>                 related_coords(x, y)]
>>
>>     # This is not (given the "let's reintroduce leaking from
>>     # comprehensions" proposal)
>>     for x in range(10):
>>         for y in range(10):
>>             related_interesting_coords = [(x, y) for x in
>>                 related_x_coord(x, y) if is_interesting(y := f(x))]
>>
>> Deliberately reintroducing stateful side effects into a nominally
>> functional construct seems like a recipe for significant confusion, even if
>> there are some cases where it might arguably be useful to folks that don't
>> want to write a named function that returns multiple values instead.
>>
>
> You should really read Tim's initial post in this thread, where he
> explains his motivation.
>

I did, and then I talked him out of it by pointing out how confusing it
would be to have the binding semantics of "x := y" be context dependent.

> It sounds like you're not buying it, but your example is just a case where
> the user is shooting themselves in the foot by reusing variable names. When
> writing `:=` you should always keep the scope of the variable in mind --
> it's no different when using `:=` outside a comprehension.
>

It *is* different, because ":=" normally binds the same way as any other
name binding operation, including "for x in y:" (i.e. it creates a local
variable). At comprehension scope, the proposal is now for "x := y" to
create a local variable in the containing scope, while "for x in y" doesn't.
Comprehension scoping is already hard to explain when it's just a regular
nested function that accepts a single argument, so I'm not looking forward
to having to explain that "x := y" implies "nonlocal x" at comprehension
scope (except that unlike a regular nonlocal declaration, it also
implicitly makes it a local in the immediately surrounding scope).
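
A minimal sketch of that asymmetry, assuming an interpreter that implements
the proposed rule (CPython 3.8 and later does). The names binding_targets,
last and squares are hypothetical and only there for illustration:

```python
def binding_targets():
    x = "outer x"
    last = "outer last"
    # "for x in ..." binds x inside the comprehension only, while
    # "last := ..." rebinds last in the containing function.
    squares = [(last := x * x) for x in range(4)]
    return x, last, squares


x_after, last_after, squares = binding_targets()
print(x_after, last_after, squares)
```

After the comprehension runs, the function's "x" is untouched while its
"last" has been overwritten, even though both names were bound by the same
expression.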

It isn't reasonable to wave this away as "It's only confusing to Nick
because he's intimately familiar with how comprehensions are implemented",
as I also wrote some of the language reference docs for the current
(already complicated) comprehension scoping semantics, and I can't figure
out how we're going to document the proposed semantics in a way that will
actually be reasonably easy for readers to follow.

The best I've been able to come up with is:

- for comprehensions at function scope (including in a lambda expression
inside a comprehension scope), a binding expression targets the nearest
function scope, not the comprehension scope, or any intervening
comprehension scope. It will appear in locals() the same way nonlocal
references usually do.
- for comprehensions at module scope, a binding expression targets the
global scope, not the comprehension scope, or any intervening comprehension
scope. It will not appear in locals() (as with any other global reference).
- for comprehensions at class scope, the class scope is ignored for
purposes of determining the target binding scope (and hence a binding
expression will implicitly create a new global variable when used in a top
level class definition, and a new function local when used in a class
definition nested inside a function).
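
The first two rules above can be sketched as follows, assuming an
interpreter that implements them (CPython 3.8 and later does for function
and module scope). The names function_scope and module_namespace are
hypothetical, and exec() with a single dict stands in for a module here,
which is an assumption of the sketch rather than part of the proposal:

```python
def function_scope():
    # The binding skips both the inner and the outer comprehension scope
    # and lands in the function's own namespace.
    rows = [[(last := r * c) for c in range(3)] for r in range(2)]
    return last, "last" in locals(), rows


# Simulate module scope: with one dict, exec() uses it as both globals
# and locals, which is how module-level code runs.
module_namespace = {}
exec("values = [(last := n) for n in range(3)]", module_namespace)

last_in_function, visible_in_locals, rows = function_scope()
print(last_in_function, visible_in_locals, module_namespace["last"])
```

The class-scope rule is not exercised: CPython as released rejects ":="
inside a comprehension in a class body with a SyntaxError rather than
adopting the rule described above.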

Sublocal scopes were a model of simplicity by comparison :)

Cheers,
Nick.

P.S. None of the above concerns apply to explicit inline scope
declarations, as those are easy to explain by saying that the inline
declarations work the same way as the scope declaration statements do, and
can be applied universally to all name binding operations rather than being
specific to ":= in comprehension scope".

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia