On 10 May 2018 at 23:22, Guido van Rossum <guido@python.org> wrote:
On Thu, May 10, 2018 at 5:17 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
How would you expect this to work in cases where the generator expression isn't immediately consumed? If "p" is nonlocal (or global) by default, then that opens up the opportunity for it to be rebound between generator steps. That gets especially confusing if you have multiple generator expressions in the same scope iterating in parallel using the same binding target:

    # This is fine
    gen1 = (p for p in range(10))
    gen2 = (p for p in gen1)

    # This is not (given the "let's reintroduce leaking from comprehensions" proposal)
    p = 0
    gen1 = (p := q for q in range(10))
    gen2 = ((p, p := q) for q in gen1)

That's just one of several "don't do that" situations. *What will happen* is perhaps hard to see at a glance, but it's perfectly well specified. Not all legal code does something useful though, and in this case the obvious advice should be to use different variables.
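
To make "what will happen" concrete, here is a minimal sketch that runs the second pair of generators inside a function. It assumes the containing-scope binding rule for ":=" that was later released in Python 3.8, and it parenthesises the tuple so that it parses. The function names are illustrative only and do not come from the thread.

```python
def shared_target():
    # Both generator expressions bind the same function-local name p.
    p = 0
    gen1 = (p := q for q in range(3))
    gen2 = ((p, p := q) for q in gen1)
    return list(gen2)


def separate_targets():
    # gen1 no longer touches p, so gen2 sees only its own earlier binding.
    p = 0
    gen1 = (q for q in range(3))
    gen2 = ((p, p := q) for q in gen1)
    return list(gen2)


print(shared_target())     # [(0, 0), (1, 1), (2, 2)]
print(separate_targets())  # [(0, 0), (0, 1), (1, 2)]
```

In the shared case gen1 rebinds p on every step before gen2 reads it, so gen2 never observes the previous value.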

I can use that *exact same argument* to justify the Python 2 comprehension variable leaking behaviour. We decided that was a bad idea based on ~18 years of experience with it, and there hasn't been a clear justification presented for going back on that decision beyond "Tim would like using it sometimes".

PEP 572 was on a nice trajectory towards semantic simplification (removing sublocal scoping, restricting to name targets only, prohibiting name binding expressions in the outermost iterable of comprehensions to avoid exposing the existing scoping quirks any more than they already are), and then we suddenly had this bizarre turn into "and they're going to be implicitly nonlocal or global when used in comprehension scope".
It also reintroduces the original problem that comprehension scopes solved, just in a slightly different form:

    # This is fine
    for x in range(10):
        for y in range(10):
            transposed_related_coords = [(y, x) for x, y in related_coords(x, y)]

    # This is not (given the "let's reintroduce leaking from comprehensions" proposal)
    for x in range(10):
        for y in range(10):
            related_interesting_coords = [(x, y) for x in related_x_coord(x, y) if is_interesting(y := f(x))]

Deliberately reintroducing stateful side effects into a nominally functional construct seems like a recipe for significant confusion, even if there are some cases where it might arguably be useful to folks that don't want to write a named function that returns multiple values instead.
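
The side effect in the second loop can be observed with a small runnable sketch. It assumes the containing-scope binding rule as later released in Python 3.8. The names related_x_coord, f and is_interesting are taken from the example, but their bodies here are invented stand-ins chosen only to make the effect visible.

```python
def related_x_coord(x, y):
    # Hypothetical stand-in: two candidate x coordinates.
    return [x, x + 1]


def f(x):
    # Hypothetical stand-in: any function of x will do.
    return x * 10


def is_interesting(value):
    # Hypothetical stand-in: accept everything.
    return True


def loop_with_binding_expression():
    observed = []
    for x in range(2):
        for y in range(2):
            y_before = y
            coords = [(x, y) for x in related_x_coord(x, y)
                      if is_interesting(y := f(x))]
            # The comprehension's own x stayed private, but y was rebound
            # in this function's scope by the binding expression.
            observed.append((x, y_before, y))
    return observed


print(loop_with_binding_expression())
# [(0, 0, 10), (0, 1, 10), (1, 0, 20), (1, 1, 20)]
```

The outer x survives each comprehension untouched, while the outer y is overwritten every time, which is the asymmetry between the "for" target and the ":=" target.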

You should really read Tim's initial post in this thread, where he explains his motivation.

I did, and then I talked him out of it by pointing out how confusing it would be to have the binding semantics of "x := y" be context dependent.

It sounds like you're not buying it, but your example is just a case where the user is shooting themselves in the foot by reusing variable names. When writing `:=` you should always keep the scope of the variable in mind -- it's no different when using `:=` outside a comprehension.

It *is* different, because ":=" normally binds the same as any other name binding operation including "for x in y:" (i.e. it creates a local variable), while at comprehension scope, the proposal has now become for "x := y" to create a local variable in the containing scope, while "for x in y" doesn't. Comprehension scoping is already hard to explain when it's just a regular nested function that accepts a single argument, so I'm not looking forward to having to explain that "x := y" implies "nonlocal x" at comprehension scope (except that unlike a regular nonlocal declaration, it also implicitly makes it a local in the immediately surrounding scope).
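
The "implies nonlocal x" reading can be sketched by comparing a comprehension with a hand-written nested function. This is only an approximation of the idea, assuming the semantics later released in Python 3.8; the helper name _listcomp and the function names are hypothetical, and the real compiler output differs in detail.

```python
def with_binding_expression(data):
    total = 0
    # "v" is private to the comprehension; "total" is rebound out here.
    running = [total := total + v for v in data]
    return running, total


def hand_written_equivalent(data):
    total = 0  # an explicit nonlocal needs this binding to already exist

    def _listcomp(iterable):
        nonlocal total
        result = []
        for v in iterable:
            total = total + v
            result.append(total)
        return result

    running = _listcomp(iter(data))
    return running, total


def implicit_local(data):
    # No prior assignment: the binding expression alone makes "last"
    # a local of this function, which a plain nonlocal could not do.
    [last := v for v in data]
    return last


print(with_binding_expression([1, 2, 3]))  # ([1, 3, 6], 6)
print(hand_written_equivalent([1, 2, 3]))  # ([1, 3, 6], 6)
print(implicit_local([1, 2, 3]))           # 3
```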

It isn't reasonable to wave this away as "It's only confusing to Nick because he's intimately familiar with how comprehensions are implemented", as I also wrote some of the language reference docs for the current (already complicated) comprehension scoping semantics, and I can't figure out how we're going to document the proposed semantics in a way that will actually be reasonably easy for readers to follow.

The best I've been able to come up with is:

- for comprehensions at function scope (including in a lambda expression inside a comprehension scope), a binding expression targets the nearest function scope, not the comprehension scope, or any intervening comprehension scope. It will appear in locals() the same way nonlocal references usually do.
- for comprehensions at module scope, a binding expression targets the global scope, not the comprehension scope, or any intervening comprehension scope. It will not appear in locals() (as with any other global reference).
- for comprehensions at class scope, the class scope is ignored for purposes of determining the target binding scope (and hence will implicitly create a new global variable when used in a top level class definition, and a new function local when used in a class definition nested inside a function).
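
The first two rules can be checked with a short sketch against the semantics later released in Python 3.8. The third cannot be shown running: in released versions a binding expression inside a comprehension directly in a class body is rejected with a SyntaxError, so the sketch only confirms the rejection. All names are illustrative, and the module case is simulated with exec() on a fresh namespace.

```python
def function_scope():
    # The target skips both comprehension scopes and lands in this function.
    table = [[last := a * b for b in range(3)] for a in range(3)]
    return table, last, "last" in locals()


# Module scope, simulated by executing source in a fresh global namespace.
module_namespace = {}
exec("squares = [last := n * n for n in range(4)]", module_namespace)

# Class scope: released Python refuses to compile this form.
class_source = "class C:\n    xs = [y := n for n in range(3)]\n"
try:
    compile(class_source, "<class-demo>", "exec")
    class_body_allowed = True
except SyntaxError:
    class_body_allowed = False

print(function_scope()[1:])           # (4, True)
print(module_namespace["last"])       # 9
print("n" in module_namespace)        # False
print(class_body_allowed)             # False
```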

Sublocal scopes were a model of simplicity by comparison :)


P.S. None of the above concerns apply to explicit inline scope declarations, as those are easy to explain by saying that the inline declarations work the same way as the scope declaration statements do, and can be applied universally to all name binding operations rather than being specific to ":= in comprehension scope".

Nick Coghlan   |   ncoghlan@gmail.com   |   Brisbane, Australia