[Python-ideas] A comprehension scope issue in PEP 572
Tim Peters
tim.peters at gmail.com
Thu May 10 13:36:56 EDT 2018
[Nick Coghlan <ncoghlan at gmail.com>]
> How would you expect this to work in cases where the generator expression
> isn't immediately consumed? If "p" is nonlocal (or global) by default, then
> that opens up the opportunity for it to be rebound between generator steps.
> That gets especially confusing if you have multiple generator expressions in
> the same scope iterating in parallel using the same binding target:
I'm most interested in what sensible programmers can do easily that's
of use, not really about pathologies that can be contrived.
> # This is fine
> gen1 = (p for p in range(10))
> gen2 = (p for p in gen1)
> print(list(gen2))
Sure.
>
> # This is not (given the "let's reintroduce leaking from comprehensions" proposal)
Be fair: it's not _re_introducing anything. It's brand new syntax
for which "it's a very much intended feature" that a not-local name
can be bound. You have to go out of your way to use it. Where it
doesn't do what you want, don't use it.
> p = 0
I'm not sure of the intent of that line. If `p` is otherwise unknown
in this block, its appearance as a binding operator target in an
immediately contained genexp establishes that `p` is local to this
block. So `p = 0` here just establishes that directly. Best I can
guess, the 0 value is never used below.
> gen1 = (p := q for q in range(10))
I expect that's a compile time error, grouping as
    gen1 = (p := (q for q in range(10)))
but without those explicit parentheses delimiting the "genexp part" it
may not be _recognized_ as being a genexp. With the extra parens, it
binds both `gen1` and `p` to the genexp, and `p` doesn't appear in the
body of the genexp at all. Or did you intend
    gen1 = ((p := q) for q in range(10))
? I'll assume that's so.
> gen2 = (p, p := q for q in gen1)
OK, I really have no guess about the intent there. Note that
    gen2 = (p, q for q in gen1)
is a syntax error today, while
    gen2 = (p, (q for q in gen1))
builds a 2-tuple. Perhaps
    gen2 = ((p, p := q) for q in gen1)
was intended?
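
The difference between those two spellings can be checked on any current Python. This is only a sketch: `compiles` is a hypothetical helper, and the values given to `p` and `gen1` are stand-ins so the snippet runs by itself.

```python
# Stand-in values; in the thread `p` and `gen1` come from earlier lines.
p = 0
gen1 = iter(range(3))

def compiles(source):
    """Return True if `source` parses as a statement, False on SyntaxError."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True

# A bare genexp cannot be one element of a parenthesized tuple.
bare_ok = compiles("gen2 = (p, q for q in gen1)")

# With its own parentheses, the genexp is just the second tuple element.
nested_ok = compiles("gen2 = (p, (q for q in gen1))")

gen2 = (p, (q for q in gen1))
pair_len = len(gen2)            # a 2-tuple: (p, <generator>)
inner_values = list(gen2[1])    # consuming the generator element
```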
Summarizing:
    gen1 = ((p := q) for q in range(10))
    gen2 = ((p, p := q) for q in gen1)
is my best guess.
> print(list(gen2))
[(0, 0), (1, 1), (2, 2), ..., (9, 9)]
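
A minimal sketch of that best guess, assuming an interpreter in which `:=` inside a genexp binds the name in the containing block (the scoping argued for here; Python 3.8 and later behave this way). The wrapper `run_guess` is a hypothetical name, added only to give `p` a containing block, and the extra parentheses around the second `p := q` are there purely for caution.

```python
def run_guess():
    # `p` is a binding-operator target in both genexps, so it is local to
    # this function, not to either genexp; the two genexps share it.
    gen1 = ((p := q) for q in range(10))
    gen2 = ((p, (p := q)) for q in gen1)
    # Each step of gen2 first advances gen1 (which binds p), then reads p,
    # then rebinds p to the same value.
    return list(gen2)

result = run_guess()
print(result)
```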
But let's not pretend it's impossible to do that today; e.g., this
code produces the same:
    class Cell:
        def __init__(self, value=None):
            self.bind(value)
        def bind(self, value):
            self.value = value
            return value
    p = Cell()
    gen1 = (p.bind(q) for q in range(10))
    gen2 = ((p.value, p.bind(q)) for q in gen1)
    print(list(gen2))
Someone using ":=" INTENDS to bind the name, just as much as someone
deliberately using that `Cell` class.
> It also reintroduces the original problem that comprehension scopes solved,
> just in a slightly different form:
>
> # This is fine
> for x in range(10):
> for y in range(10):
> transposed_related_coords = [y, x for x, y in related_coords(x, y)]
I'm not clear on what "This is fine" means, other than that the code
does whatever it does. That's part of why I so strongly prefer
real-life use cases. In the code above, I can't imagine what the
intent of the code might be _unless_ they're running tons of
otherwise-useless code for _side effects_ performed by calling
`related_coords()`. If "it's functional", they could do the same via
    x = y = 9
    transposed_related_coords = [y, x for x, y in related_coords(x, y)]
except that's a syntax error ;-) I assume
    transposed_related_coords = [(y, x) for x, y in related_coords(x, y)]
was intended.
BTW, I'd shoot anyone who tried to check in that code today ;-) It
inherently relies on the name `x` inside the listcomp referring to
two entirely different scopes, and that's Poor Practice (the `x` in
the `related_coords()` call refers to the `x` in `for x in range(10)`,
but all other instances of `x` refer to the listcomp-local `x`).
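
A small sketch of those two scopes as Python 3 resolves them today. `related_coords` here is a hypothetical stand-in (the thread never defines it), chosen only so the result is easy to follow.

```python
def related_coords(x, y):
    # Hypothetical stand-in: two neighbours of (x, y).
    return [(x + 1, y), (x, y + 1)]

x, y = 3, 4

# The outermost iterable, related_coords(x, y), is evaluated in the
# enclosing block, so it sees x == 3 and y == 4.  The `x, y` targets and
# the `(y, x)` element use the listcomp's own local names.
transposed = [(y, x) for x, y in related_coords(x, y)]

# The enclosing x and y are untouched by the listcomp's loop targets.
outer_after = (x, y)
```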
> # This is not (given the "let's reintroduce leaking from comprehensions" proposal)
> for x in range(10):
> for y in range(10):
> related_interesting_coords = [x, y for x in related_x_coord(x, y)
> if is_interesting(y := f(x))]
Same syntax error there (you need parens around "x, y" at the start of
the listcomp).
Presumably they _intended_ to build (x, f(x)) pairs when and only when
`f(x)` "is interesting". In what specific way does the code fail to
do that? Yes, the outer `y` is rebound, but what of it? When the
statement completes, `y` will be rebound to the next value from the
inner range(10), and that's the value of `y` seen by
`related_x_coord(x, y)` the next time the loop body runs. The binding
done by `:=` is irrelevant to that.
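
To make that concrete, here is a sketch with hypothetical stand-ins for `related_x_coord`, `f` and `is_interesting` (none are defined in the thread), assuming `:=` binds `y` in the containing block as in Python 3.8 and later. It records the `y` values that `related_x_coord` actually receives.

```python
seen_y = []  # every y passed to related_x_coord, in call order

def related_x_coord(x, y):
    # Hypothetical stand-in that records the y it was called with.
    seen_y.append(y)
    return [x, x + 1]

def f(x):
    # Hypothetical stand-in.
    return x * 100

def is_interesting(value):
    # Hypothetical stand-in.
    return value % 200 == 0

results = []
for x in range(2):
    for y in range(3):
        # `y := f(x)` rebinds the enclosing y, but the `for` statement
        # rebinds it again from range(3) before the next call.
        pairs = [(x, y) for x in related_x_coord(x, y)
                 if is_interesting(y := f(x))]
        results.append(pairs)
```

The recorded values are exactly the ones produced by the inner `range(3)`, twice over, whatever `:=` did to `y` in between.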
So I don't see your point in that specific example, although - sure! -
of course it's possible to contrive examples where it really would
matter. For example, change the above in some way to use `x` as the
binding operator target inside the listcomp. Then that _could_ affect
the value of `x` seen by `related_x_coord(x, y)` across inner loop
iterations.
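
One way to contrive such a case, again only a sketch with hypothetical stand-ins and Python 3.8+ scoping assumed. The listcomp's loop variable is renamed `z` so that `x` can be the binding target.

```python
seen_x = []  # every x passed to related_x_coord, in call order

def related_x_coord(x, y):
    # Hypothetical stand-in that records the x it was called with.
    seen_x.append(x)
    return [x + 1]

def f(z):
    # Hypothetical stand-in.
    return z * 10

for x in range(2):
    for y in range(3):
        # `x := f(z)` rebinds the enclosing x.  Only the outer `for`
        # resets x, so the rebinding survives across inner iterations.
        pairs = [(z, y) for z in related_x_coord(x, y) if (x := f(z)) >= 0]
```

Here `related_x_coord` sees 0, 10, 110 during the first outer iteration and 1, 20, 210 during the second.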
> Deliberately reintroducing stateful side effects into a nominally functional
> construct seems like a recipe for significant confusion,
Side effects of any kind anywhere can create significant confusion.
But Python is not a functional language, and if you don't want side
effects due to ":=" in synthetic functions, you're not required to use
":=" in that context. That said, I agree "it would be nice" if
advanced users had a way to explicitly say which scope they want.
> even if there are some cases where it might arguably be useful to folks
> that don't want to write a named function that returns multiple values instead.
Sorry, I didn't follow that - functions returning multiple values?