[Python-ideas] A comprehension scope issue in PEP 572

Thu May 10 09:22:17 EDT 2018

On Thu, May 10, 2018 at 5:17 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:

> On 9 May 2018 at 03:06, Guido van Rossum <guido at python.org> wrote:
>
>> So the way I envision it is that *in the absence of a nonlocal or global
>> declaration in the containing scope*, := inside a comprehension or genexpr
>> causes the compiler to assign to a local in the containing scope, which is
>> elevated to a cell (if it isn't already). If there is an explicit nonlocal
>> or global declaration in the containing scope, that is honored.
>>
>> Examples:
>>
>>   # Simplest case, neither nonlocal nor global declaration
>>   def foo():
>>       [p := q for q in range(10)]  # Creates foo-local variable p
>>       print(p)  # Prints 9
>>
>>   # There's a nonlocal declaration
>>   def bar():
>>       p = 42  # Needed to determine its scope
>>       def inner():
>>           nonlocal p
>>           [p := q for q in range(10)]  # Assigns to p in bar's scope
>>       inner()
>>       print(p)  # Prints 9
>>
>>   # There's a global declaration
>>   def baz():
>>       global p
>>       [p := q for q in range(10)]
>>   baz()
>>   print(p)  # Prints 9
>>
>> All these would work the same way if you wrote list(p := q for q in
>> range(10)) instead of the comprehension.
>>
>
> How would you expect this to work in cases where the generator expression
> isn't immediately consumed? If "p" is nonlocal (or global) by default, then
> that opens up the opportunity for it to be rebound between generator steps.
> That gets especially confusing if you have multiple generator expressions
> in the same scope iterating in parallel using the same binding target:
>
>     # This is fine
>     gen1 = (p for p in range(10))
>     gen2 = (p for p in gen1)
>     print(list(gen2))
>
>     # This is not (given the "let's reintroduce leaking from
>     # comprehensions" proposal)
>     p = 0
>     gen1 = (p := q for q in range(10))
>     gen2 = ((p, p := q) for q in gen1)
>     print(list(gen2))
>

That's just one of several "don't do that" situations. *What will happen*
is perhaps hard to see at a glance, but it's perfectly well specified. Not
all legal code does something useful though, and in this case the obvious
advice should be to use different variables.
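
To make "well specified" concrete, here is a minimal sketch of the quoted
example. It assumes the semantics proposed above (the ones that later
shipped in Python 3.8), where := inside a genexpr binds in the containing
scope. The function name parallel() is made up for illustration, the tuple
and the := targets are parenthesized, and range(1, 4) is used instead of
range(10) so that the initial p = 0 is distinguishable in the output.

```python
def parallel():
    p = 0
    # Both generator expressions bind the same p, which lives in parallel().
    gen1 = ((p := q) for q in range(1, 4))
    gen2 = ((p, (p := q)) for q in gen1)
    # Each step of gen2 first advances gen1, which rebinds p, and only then
    # reads p for the first tuple element.
    return list(gen2), p


pairs, final_p = parallel()
print(pairs)    # [(1, 1), (2, 2), (3, 3)]
print(final_p)  # 3
```

The initial 0 never shows up, because gen1 has already rebound p by the
time gen2 reads it. Giving each generator its own target name avoids the
interaction.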

> It also reintroduces the original problem that comprehension scopes
> solved, just in a slightly different form:
>
>     # This is fine
>     for x in range(10):
>         for y in range(10):
>             transposed_related_coords = [(y, x) for x, y in
>                                          related_coords(x, y)]
>
>     # This is not (given the "let's reintroduce leaking from
>     # comprehensions" proposal)
>     for x in range(10):
>         for y in range(10):
>             related_interesting_coords = [(x, y) for x in
>                                           related_x_coord(x, y)
>                                           if is_interesting(y := f(x))]
>
> Deliberately reintroducing stateful side effects into a nominally
> functional construct seems like a recipe for significant confusion, even if
> there are some cases where it might arguably be useful to folks that don't
> want to write a named function that returns multiple values instead.
>

You should really read Tim's initial post in this thread, where he explains
his motivation. It sounds like you're not buying it, but your example is
just a case where the user is shooting themselves in the foot by reusing
variable names. When writing `:=` you should always keep the scope of the
variable in mind -- it's no different when using `:=` outside a
comprehension.
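
A small sketch of that foot-gun and its fix, again assuming the proposed
semantics (as later implemented in Python 3.8). The names clobbered(),
renamed(), kept and scaled are hypothetical, and the x * 10 condition is a
stand-in for the f() and is_interesting() calls in the quoted example.

```python
def clobbered():
    seen = []
    for y in range(3):
        # (y := ...) binds y in clobbered(), so it overwrites the loop variable.
        kept = [x for x in range(4) if (y := x * 10) >= 0]
        seen.append(y)  # y now holds the last value assigned in the comprehension
    return seen


def renamed():
    seen = []
    for y in range(3):
        # A distinct target name leaves the loop variable alone.
        kept = [x for x in range(4) if (scaled := x * 10) >= 0]
        seen.append(y)
    return seen


print(clobbered())  # [30, 30, 30]
print(renamed())    # [0, 1, 2]
```

The for loop itself still runs three times in both versions, since it
draws from its own iterator. Only the value of y seen after the
comprehension changes.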

PS. Thanks for the suggestion about conflicting signals about scope; that's
what we'll do.

-- 
--Guido van Rossum (python.org/~guido)