[Python-ideas] A comprehension scope issue in PEP 572

Tim Peters tim.peters at gmail.com
Thu May 10 13:36:56 EDT 2018


[Nick Coghlan <ncoghlan at gmail.com> ]
> How would you expect this to work in cases where the generator expression
> isn't immediately consumed? If "p" is nonlocal (or global) by default, then
> that opens up the opportunity for it to be rebound between generator steps.
> That gets especially confusing if you have multiple generator expressions in
> the same scope iterating in parallel using the same binding target:

I'm most interested in what sensible programmers can do easily that's
of use, not in pathologies that can be contrived.

>     # This is fine
>     gen1 = (p for p in range(10))
>     gen2 = (p for p in gen1)
>     print(list(gen2))

Sure.

>
>     # This is not (given the "let's reintroduce leaking from comprehensions" proposal)

Be fair:  it's not _re_introducing anything.  It's brand new syntax
for which "it's a very much intended feature" that a not-local name
can be bound.  You have to go out of your way to use it.  Where it
doesn't do what you want, don't use it.

>     p = 0

I'm not sure of the intent of that line.  If `p` is otherwise unknown
in this block, its appearance as a binding operator target in an
immediately contained genexp establishes that `p` is local to this
block.  So `p = 0` here just establishes that directly.  Best I can
guess, the 0 value is never used below.

>     gen1 = (p := q for q in range(10))

I expect that's a compile time error, grouping as

    gen1 = (p := (q for q in range(10)))

but without those explicit parentheses delimiting the "genexp part" it
may not be _recognized_ as being a genexp.  With the extra parens, it
binds both `gen1` and `p` to the genexp, and `p` doesn't appear in the
body of the genexp at all.  Or did you intend

    gen1 = ((p := q) for q in range(10))

?  I'll assume that's so.
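
For anyone who wants to poke at the two parenthesized readings, here is a minimal sketch. It assumes an interpreter that implements `:=` with the containing-scope rule under discussion (Python 3.8 and later do); the names `same_object`, `values` and `last_p` are only there for illustration.

```python
# Reading 1: the genexp is the value being named, so `p` and `gen1`
# end up bound to the very same generator object.
gen1 = (p := (q for q in range(10)))
same_object = gen1 is p

# Reading 2: `p` is rebound to each `q` as the generator is consumed.
gen1 = ((p := q) for q in range(10))
values = list(gen1)
last_p = p   # `p` is bound in the scope containing the genexp

print(same_object, values, last_p)
```

In the second reading `p` stays unbound by the genexp until something actually steps the generator.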


>     gen2 = (p, p := q for q in gen1)

OK, I really have no guess about the intent there.  Note that

    gen2 = (p, q for q in gen1)

is a syntax error today, while

    gen2 = (p, (q for q in gen1))

builds a 2-tuple.  Perhaps

    gen2 = ((p, p := q) for q in gen1)

was intended?
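
Those two claims about today's syntax are easy to check without crashing the script. This is just a sketch; the helper `is_syntax_error` is made up for the purpose and the `gen1` here is a throwaway iterator.

```python
import types

def is_syntax_error(source):
    """Return True if `source` fails to compile (hypothetical helper)."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return True
    return False

# The unparenthesized spelling does not compile.
unparenthesized_fails = is_syntax_error("gen2 = (p, q for q in gen1)")

# With the genexp parenthesized, the result is an ordinary 2-tuple
# whose second element is a generator object.
p = 0
gen1 = iter(range(3))
gen2 = (p, (q for q in gen1))

print(unparenthesized_fails, type(gen2).__name__, len(gen2))
```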

Summarizing:

    gen1 = ((p := q) for q in range(10))
    gen2 = ((p, p := q) for q in gen1)

is my best guess.

>     print(list(gen2))

[(0, 0), (1, 1), (2, 2), ..., (9, 9)]

But let's not pretend it's impossible to do that today; e.g., this
code produces the same:

    class Cell:
        def __init__(self, value=None):
            self.bind(value)
        def bind(self, value):
            self.value = value
            return value

    p = Cell()
    gen1 = (p.bind(q) for q in range(10))
    gen2 = ((p.value, p.bind(q)) for q in gen1)
    print(list(gen2))

Someone using ":=" INTENDS to bind the name, just as much as someone
deliberately using that `Cell` class.
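
The two spellings can be run side by side. A sketch, assuming Python 3.8 or later for the `:=` half and assuming my guess above about the intended parenthesization; the wrapper functions `with_walrus` and `with_cell` are only there to keep the two versions apart.

```python
class Cell:
    def __init__(self, value=None):
        self.bind(value)
    def bind(self, value):
        self.value = value
        return value

def with_walrus():
    p = 0   # never actually read; the genexps rebind `p` first
    gen1 = ((p := q) for q in range(10))
    gen2 = ((p, p := q) for q in gen1)
    return list(gen2)

def with_cell():
    p = Cell()
    gen1 = (p.bind(q) for q in range(10))
    gen2 = ((p.value, p.bind(q)) for q in gen1)
    return list(gen2)

walrus_pairs = with_walrus()
cell_pairs = with_cell()
print(walrus_pairs == cell_pairs)
```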

> It also reintroduces the original problem that comprehension scopes solved,
> just in a slightly different form:
>
>     # This is fine
>     for x in range(10):
>         for y in range(10):
>             transposed_related_coords = [y, x for x, y in related_coords(x, y)]

I'm not clear on what "This is fine" means, other than that the code
does whatever it does.  That's part of why I so strongly prefer
real-life use cases.  In the code above, I can't imagine what the
intent of the code might be _unless_ they're running tons of
otherwise-useless code for _side effects_ performed by calling
`related_coords()`.  If "it's functional", they could do the same via

    x = y = 9
    transposed_related_coords = [y, x for x, y in related_coords(x, y)]

except that's a syntax error ;-)  I assume

    transposed_related_coords = [(y, x) for x, y in related_coords(x, y)]

was intended.

BTW, I'd shoot anyone who tried to check in that code today ;-)  It
inherently relies on the name `x` inside the listcomp referring to
two entirely different scopes, and that's Poor Practice (the `x` in
the `related_coords()` call refers to the `x` in `for x in range(10)`,
but all other instances of `x` refer to the listcomp-local `x`).
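
The two-scopes point can be seen in a few lines of current Python, no `:=` needed. This is only a sketch: `related_coords` was never defined in the thread, so the body below is an invented stand-in.

```python
def related_coords(x, y):
    # Hypothetical stand-in; the real function is unknown.
    return [(x, y + 1), (x + 1, y)]

x = y = 9
# The call related_coords(x, y) reads the outer x and y (both 9).
# Every other x and y in the listcomp is local to the listcomp.
transposed_related_coords = [(y, x) for x, y in related_coords(x, y)]

print(transposed_related_coords, x, y)
```

The outer `x` and `y` are untouched afterwards, which is the comprehension-scope behavior Python 3 has today.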


>     # This is not (given the "let's reintroduce leaking from comprehensions" proposal)
>     for x in range(10):
>         for y in range(10):
>             related_interesting_coords = [x, y for x in related_x_coord(x, y)
>                                                             if is_interesting(y := f(x))]

Same syntax error there (you need parens around "x, y" at the start of
the listcomp).

Presumably they _intended_ to build (x, f(x)) pairs when and only when
`f(x)` "is interesting".  In what specific way does the code fail to
do that?  Yes, the outer `y` is rebound, but what of it?  When the
statement completes, `y` will be rebound to the next value from the
inner range(10), and that's the value of `y` seen by
`related_x_coord(x, y)` the next time the loop body runs.  The binding
done by `:=` is irrelevant to that.
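
Here is a sketch of that, assuming Python 3.8 or later semantics for `:=` and with the missing parentheses around `x, y` added. The bodies of `f`, `is_interesting` and `related_x_coord` are invented stand-ins, and the loops are shortened so the result is easy to read.

```python
def f(x):
    return x * 10                  # hypothetical

def is_interesting(value):
    return value % 20 == 0         # hypothetical

def run():
    seen_y = []                    # every y passed to related_x_coord

    def related_x_coord(x, y):     # hypothetical stand-in
        seen_y.append(y)
        return [x, x + 1]

    results = []
    for x in range(2):
        for y in range(3):
            results.append([(x, y) for x in related_x_coord(x, y)
                            if is_interesting(y := f(x))])
    return seen_y, results

seen_y, results = run()
print(seen_y)
```

Each call to `related_x_coord` sees the value the inner `for` just assigned to `y`, whatever `:=` did to `y` during the previous listcomp.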

So I don't see your point in that specific example, although - sure! -
of course it's possible to contrive examples where it really would
matter.  For example, change the above in some way to use `x` as the
binding operator target inside the listcomp.  Then that _could_ affect
the value of `x` seen by `related_x_coord(x, y)` across inner loop
iterations.
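
One contrived way to do that, again as a sketch assuming Python 3.8 or later semantics. The listcomp's own loop variable is renamed to `v` (a name I made up) so that `x` is free to be the `:=` target, and `related_x_coord` is an invented stand-in that records the `x` it is passed.

```python
def run():
    seen_x = []                    # every x passed to related_x_coord

    def related_x_coord(x, y):     # hypothetical stand-in
        seen_x.append(x)
        return [y]

    for x in range(2):
        for y in range(3):
            # `x := ...` rebinds the x of the enclosing block, so the
            # next inner iteration passes the rebound value.
            _ = [v for v in related_x_coord(x, y) if (x := v + 100)]
    return seen_x

seen_x = run()
print(seen_x)
```

Only the first inner iteration of each outer pass sees the `x` supplied by `for x in range(2)`.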


> Deliberately reintroducing stateful side effects into a nominally functional
> construct seems like a recipe for significant confusion,

Side effects of any kind anywhere can create significant confusion.
But Python is not a functional language, and if you don't want side
effects due to ":=" in synthetic functions, you're not required to use
":=" in that context.  That said, I agree "it would be nice" if
advanced users had a way to explicitly say which scope they want.


> even if there are some cases where it might arguably be useful to folks
> that don't want to write a named function that returns multiple values instead.

Sorry, I didn't follow that - functions returning multiple values?

