A real life example of "given"

I thought I would share a recent use I had for "given":

I have this comprehension:

    potential_updates = {y: command.create_potential_update(y)
                         for x in need_initialization_nodes
                         for y in [x, *x.synthetic_inputs()]}

I want to filter out values that are None. I don't want to call the function twice, so I have to resort to using a loop and appending, or the "for z in [y]" trick. With "given", I can write:

    potential_updates = {
        y: potential_update
        for x in need_initialization_nodes
        for y in [x, *x.synthetic_inputs()]
        given potential_update = command.create_potential_update(y)
        if potential_update is not None}

I also wanted to point out that code like this is self-commenting because the key and value of the comprehension can be given names.

On Wed, May 30, 2018 at 02:42:21AM -0700, Neil Girdhar wrote:
I'm not sure if that would be legal for the "given" syntax. As I understand it, the "given" syntax is:

    expression given name = another_expression

but you've got half of the comprehension stuffed in the gap between the leading expression and the "given" keyword:

    expression COMPREH- given name = another_expression -ENSION

so I think that's going to be illegal. I think it wants to be written this way:

    potential_updates = {
        y: potential_update
        for x in need_initialization_nodes
        for y in [x, *x.synthetic_inputs()]
        if potential_update is not None
        given potential_update = command.create_potential_update(y)
        }

Or maybe it should be this?

    potential_updates = {
        y: potential_update
        given potential_update = command.create_potential_update(y)
        for x in need_initialization_nodes
        for y in [x, *x.synthetic_inputs()]
        if potential_update is not None
        }

I'm damned if I know which way is correct. Either of them? Neither?

In comparison, I think that := is much simpler. There's only one place it can go:

    potential_updates = {
        y: potential_update
        for x in need_initialization_nodes
        for y in [x, *x.synthetic_inputs()]
        if (
            potential_update := command.create_potential_update(y)
            ) is not None
        }

-- Steve
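For comparison, the spellings that do not need "given" can be run side by side. This is only a sketch: it assumes Python 3.8 or later (where := is available), and `Node` and `create_potential_update` are hypothetical stand-ins for the objects in the original post, not the real code.

```python
# Hypothetical stand-ins for the names used in the thread.
class Node:
    def __init__(self, name, inputs=()):
        self.name = name
        self._inputs = list(inputs)

    def synthetic_inputs(self):
        return self._inputs


def create_potential_update(node):
    # Assumed behaviour: nodes whose name starts with "_" have no update.
    return None if node.name.startswith("_") else f"update({node.name})"


def with_loop(nodes):
    # Plain loop: one call per node, then filter.
    updates = {}
    for x in nodes:
        for y in [x, *x.synthetic_inputs()]:
            potential_update = create_potential_update(y)
            if potential_update is not None:
                updates[y] = potential_update
    return updates


def with_for_trick(nodes):
    # The "for z in [value]" trick binds a name inside the comprehension.
    return {
        y: potential_update
        for x in nodes
        for y in [x, *x.synthetic_inputs()]
        for potential_update in [create_potential_update(y)]
        if potential_update is not None
    }


def with_walrus(nodes):
    # The := spelling: bind and test in the filter clause.
    return {
        y: potential_update
        for x in nodes
        for y in [x, *x.synthetic_inputs()]
        if (potential_update := create_potential_update(y)) is not None
    }
```

All three call the function once per node and produce the same dictionary; they differ only in where the name is introduced.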

In comparison, I think that := is much simpler.
In this case that's true, but a small modification:

    updates = {
        y: do_something_to(potential_update)
        for x in need_initialization_nodes
        for y in [x, *x.synthetic_inputs()]
        if potential_update is not None
        given potential_update = command.create_potential_update(y)
        }

shows the flexibility of this given syntax vs ":=".

If we think of "given" as just inserting a line with variable-definitions before the preceding statement, it seems clear that:

    updates = {
        y: potential_update
        given potential_update = command.create_potential_update(y)
        for x in need_initialization_nodes
        for y in [x, *x.synthetic_inputs()]
        if potential_update is not None
        }

should raise a NameError: name 'potential_update' is not defined, and

    updates = {
        y: potential_update
        for x in need_initialization_nodes
        for y in [x, *x.synthetic_inputs()]
        given potential_update = command.create_potential_update(y)
        if potential_update is not None
        }

should raise a NameError: name 'y' is not defined.

For safety it seems reasonable that if a variable is "given" in a comprehension, trying to refer to it (even if it is defined in the enclosing scope) before the inner-definition will result in a NameError.

On Wed, May 30, 2018 at 2:22 PM, Steven D'Aprano <steve@pearwood.info> wrote:

On Wed, May 30, 2018 at 11:32 AM Peter O'Connor <peter.ed.oconnor@gmail.com> wrote:
The reason I want it like that for comprehensions is that I think of it as equivalent to:

    updates = {}
    for x in need_initialization_nodes:
        for y in [x, *x.synthetic_inputs()]:
            potential_update = command.create_potential_update(y)
            if potential_update is not None:
                updates[y] = potential_update

But you're right that this would be a second addition to the grammar. One addition would be to "test" for something like

    test: bool_test [comp_given]
    bool_test: or_test ['if' or_test 'else' test] | lambdef
    comp_given: 'given' testlist_star_expr annassign

The second would permit the usage in comprehensions:

    comp_iter: comp_for | comp_if | comp_given

Best,
Neil

For safety it seems reasonable that if a variable is "given" in a

On Thu, May 31, 2018 at 1:23 AM, Peter O'Connor <peter.ed.oconnor@gmail.com> wrote:
I don't understand what you're showcasing here. With :=, you give a name to something at the exact point that it happens:

    updates = {
        y: do_something_to(potential_update)
        for x in need_initialization_nodes
        for y in [x, *x.synthetic_inputs()]
        if (potential_update := command.create_potential_update(y)) is not None
        }

Personally, I'd use a shorter name for something that's used in such a small scope (same as you use one-letter "x" and "y"). But that's the only way that the 'given' syntax looks at all better - by encouraging you to use yet another line, it conceals some of its immense verbosity. (Note how the name "potential_update" is used twice with :=, once to set and once to retrieve; but with given, it's used three times - retrieve, retrieve, and set.)

How does this show that 'given' is more flexible?

ChrisA

Peter wrote:
Well you could just do:
    z = {a: b for b in (transform(bi) for bi in bs) for a in as_}

That works, but I prefer the implicit nesting of a sequence of "comp_for" expressions to the nested generator.

On Wed, May 30, 2018 at 2:16 PM Chris Angelico <rosuav@gmail.com> wrote:
I feel you. I think of "given" as an assignment that is in front of the expression, just like "for" (in comp_for) is a for loop that is in front, and "if" (in comp_if) is a condition that is in front.

On Thu, May 31, 2018 at 04:06:51AM +1000, Chris Angelico wrote:
Possibly you shouldn't have tried reading at 4am. Either that or I shouldn't be reading before I've had a coffee :-)

Have I missed something that you have seen? Even if the syntax were legal, that seems to be a pointless use of an assignment expression. Since the new name "transformed_b" is only used once, we can and should just use the transform(b) in place:

    z = {a: transform(b) for b in bs for a in as_}

If we need to use it twice, we can do this:

    # assume "@" stands in for something useful
    z = {a: (transformed_b := transform(b)) @ transformed_b
         for b in bs for a in as_}

I'm not seeing the advantage of given, or any extra flexibility here, unless the aim is to encourage people to make syntax errors :-)

What have I missed?

-- Steve

On Thu, May 31, 2018 at 10:05 AM, Steven D'Aprano <steve@pearwood.info> wrote:
Yep, as mentioned in the other post. The fact that you talk like this about it - asserting that it's obvious what this does, while still considering it to be utterly useless - is proof, IMO, that this should be frowned upon in style guides. ChrisA

On Wed, May 30, 2018 at 8:10 PM Steven D'Aprano <steve@pearwood.info> wrote:
Chris just explained it to you. You're calling transform too often.
What have I missed?
The flexibility of "given" is in giving names to elements of expressions and comprehensions to avoid recalculation.
Like you say, := and given both work for expressions. "given" could theoretically also be used in comprehensions.

On Wed, May 30, 2018 at 01:59:37PM -0400, Neil Girdhar wrote:
Is that even legal? Again, you're putting half of the comprehension in the middle of the given expression. I believe that "given" expression syntax is:

    expression given name = another_expression

it's not a syntactic form that we can split across arbitrary chunks of code:

    # surely this won't be legal?
    def method(self, arg, x=spam):
        body
        given spam = expression

Comprehension syntax in this case is:

    {key:expr for b in it1 for a in it2}

(of course comprehensions can also include more loops and if clauses, but this example doesn't use those). So you've interleaved part of the given expression and part of the comprehension:

    {key: expression COMPRE- given name = another_expression -HENSION}

That's the second time you've done that. Neil, if my analysis is correct, I think you have done us a great service: showing that the "given" expression syntax really encourages people to generate syntax errors in their comprehensions.
There is no nice, equivalent := version as far as I can tell.
Given (pun intended) the fact that you only use transformed_b in a single place, I don't think it is necessary to use := at all.

    z = {a: transform(b) for b in bs for a in as_}

But if you really insist:

    # Pointless use of :=
    z = {a: (transformed_b := transform(b)) for b in bs for a in as_}

-- Steve

On Thu, May 31, 2018 at 9:53 AM, Steven D'Aprano <steve@pearwood.info> wrote:
That's the subtlety of the 'given' usage here. You fell for the same trap I did: thinking "it's only used once". Actually, what he has is equivalent to:

    z = {a: tb for b in bs for tb in [transform(b)] for a in as_}

which means it evaluates transform(b) once regardless of the length of as_. But it's really REALLY not obvious. That's why I actually prefer the "interpolated 'for' loop" notation, despite it being distinctly distasteful in general. At least it's obvious that something weird is happening, so you don't instantly assume that you can inline the single usage.

ChrisA
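The difference in call counts can be checked directly. The sketch below is illustrative only: `make_counting_transform` is a hypothetical helper that records each call, and multiplying by 10 is an arbitrary stand-in for the real transform.

```python
def make_counting_transform():
    # Returns a transform function plus the list it records its calls in.
    calls = []

    def transform(b):
        calls.append(b)
        return b * 10

    return transform, calls


def inline_version(bs, as_):
    # transform(b) sits in the result expression: evaluated per (b, a) pair.
    transform, calls = make_counting_transform()
    z = {a: transform(b) for b in bs for a in as_}
    return z, len(calls)


def bound_once_version(bs, as_):
    # transform(b) sits between the two loops: evaluated once per b.
    transform, calls = make_counting_transform()
    z = {a: tb for b in bs for tb in [transform(b)] for a in as_}
    return z, len(calls)
```

With two items in bs and three in as_, the first version calls transform six times and the second only twice, although both build the same dictionary.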

On Thu, May 31, 2018 at 10:05:33AM +1000, Chris Angelico wrote:
But it is only used once. I meant once per loop. It isn't used in the "for a in as_" inner loop, there's no "if transformed_b" condition, and it only is used once in the key:value part of the comprehension.
Actually, what he has is equivalent to:
z = {a: tb for b in bs for tb in [transform(b)] for a in as_}
Which also uses tb only once, making it a Useless Use Of Assignment. (I assume we're not calling transform() for some side-effect, like logging a message, or erasing your hard drive.)
which means it evaluates transform(b) once regardless of the length of as_.
Ah yes, I see what you mean. Expanded to a loop:

    for b in bs:
        tb = transform(b)
        for a in as_:
            z[a] = tb

It's a little ugly, but there's a trick I already use today:

    py> [x+y for x in "abc" if print(x) or True for y in "de"]
    a
    b
    c
    ['ad', 'ae', 'bd', 'be', 'cd', 'ce']

So we can adapt that to assignment instead of output:

    # Don't do this!
    z = {a: tb for b in bs if (tb := transform(b)) or True for a in as_}

But I wouldn't do that. If I'm concerned about the call to transform (because it is super expensive, say) then I set up a pipeline:

    tbs = (transform(b) for b in bs)  # or map(transform, bs)
    z = {a: tb for tb in tbs for a in as_}

The first generator comprehension can be easily embedded in the other:

    z = {a: tb for tb in (transform(b) for b in bs) for a in as_}

This makes it super-obvious that transform is called for each b, not for each (b, a) pair, it works today, and there's no assignment expression needed at all.

Assignment expressions should not be about adding yet a third way to solve a problem that already has a perfectly good solution! ("Expand to a loop statement" is not a *perfectly* good solution.) To showcase assignment expressions, we should be solving problems that don't have a good solution now.

I'm still not convinced that Neil's "given" example will even work (see below) but *if he is right* that it does, perhaps that's a good reason to prefer the simpler := assignment expression syntax, since we're less likely to use it in confusing ways.
But it's really REALLY not obvious.
But is it even legal? As I understand it, "given" is an expression, not an addition to comprehension syntax. In that case, I don't think Neil's example will work at all, for reasons I've already stated.

If that's not the case, then until somebody tells me what this new comprehension syntax means, and what it looks like, I have no idea what is intended. Which of these can we write, and what do they do?

    [expression given name=something for x in seq]
    [expression for x given name=something in seq]
    [expression for x in seq given name=something]
    [expression for x in seq if given name=something condition]
    [expression for x in seq if condition given name=something]

-- Steve
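The "or True" in the "Don't do this!" spelling is load-bearing, which a small experiment shows. This is a sketch assuming Python 3.8+; `transform` here is a made-up function chosen so that it returns a falsy value for one input, and pairs are collected in a list rather than a dict so that dropped items are visible.

```python
def transform(b):
    # Hypothetical transform: returns 0 (falsy) when b == 1.
    return b - 1


def without_guard(bs, as_):
    # The filter doubles as an assignment, so falsy results are dropped.
    return [(a, tb) for b in bs if (tb := transform(b)) for a in as_]


def with_guard(bs, as_):
    # "or True" keeps every b regardless of what transform returns.
    return [(a, tb) for b in bs if (tb := transform(b)) or True for a in as_]


def pipeline(bs, as_):
    # The nested-generator spelling needs no assignment expression.
    return [(a, tb) for tb in (transform(b) for b in bs) for a in as_]
```

For bs = [1, 2], the unguarded version silently loses every pair for b == 1, while the guarded version and the pipeline agree.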

On Wed, May 30, 2018 at 9:02 PM Steven D'Aprano <steve@pearwood.info> wrote:
Great question. The trick is to just write them as a sequence of statements without changing the order except to put the expression last.

[expression given name=something for x in seq]

    retval = []
    name = something
    for x in seq:
        retval.append(expression)
    return retval

[expression for x given name=something in seq]

this one doesn't make sense.

[expression for x in seq given name=something]

    retval = []
    for x in seq:
        name = something
        retval.append(expression)
    return retval

[expression for x in seq if given name=something condition]

this one doesn't make sense.

[expression for x in seq if condition given name=something]

    retval = []
    for x in seq:
        if condition:
            name = something
            retval.append(expression)
    return retval

and of course, the original proposal

    expression given name=something

means:

    name = something
    retval = expression
    return retval

Okay, I thought about it some more, and I think I'm mistaken about the possibility of adding both rules to the grammar, since in that case it is ambiguous whether given binds more tightly to a trailing expression or to the comp_iter. It's too bad.

On Wed, May 30, 2018 at 10:50 PM Neil Girdhar <mistersheik@gmail.com> wrote:

On Thu, May 31, 2018 at 4:50 AM, Neil Girdhar <mistersheik@gmail.com> wrote:
That's a little strange confusing then, because, given the way given is used outside of comprehensions, you would expect for x in range(3): y given y=2*x [y given y=2*x for x in range(3)] to return [0, 2, 4], but it would actually raise an error.

* Sorry, message sent too early: On Thu, May 31, 2018 at 4:50 AM, Neil Girdhar <mistersheik@gmail.com> wrote:
That's a little confusing then, because, given the way given is used outside of comprehensions, you would expect

    [y given y=2*x for x in range(3)]

to return [0, 2, 4], but it would actually raise an error.

On Thu, May 31, 2018 at 10:32 AM, Peter O'Connor <peter.ed.oconnor@gmail.com> wrote:

Yes, you're right. That's the ambiguity I mentioned in my last message. It's too bad because I want given for expressions and given for comprehensions. But if you have both, there's ambiguity and you would at least need parentheses:

    [(y given y=2*x) for x in range(3)]

That might be fine.

On Thu, May 31, 2018 at 4:34 AM Peter O'Connor <peter.ed.oconnor@gmail.com> wrote:

Well, there need not be any ambiguity if you think of "B given A" as "execute A before B", and remember that "given" has a lower precedence than "for" (so [B given A for x in seq] is parsed as [(B given A) for x in seq]).

Then

    retval = [expr(name) given name=something(x) for x in seq]

Is:

    retval = []
    for x in seq:
        name = something(x)
        retval.append(expr(name))

And

    retval = [expr(name, x) for x in seq given name=something]

Is:

    retval = []
    name = something
    for x in seq:
        retval.append(expr(name, x))

But this is probably not a great solution, as it forces you to mentally unwrap comprehensions in a strange order and remember a non-obvious precedence rule.

On the plus-side, it lets you initialize generators with in-loop updates (which cannot as far as I see be done nicely with ":="):

    retval = [expr(name, x) given name=update(name, x) for x in seq given name=something]

Is:

    retval = []
    name = something
    for x in seq:
        name = update(name, x)
        retval.append(expr(name, x))

On Thu, May 31, 2018 at 10:44 AM, Neil Girdhar <mistersheik@gmail.com> wrote:

On Thu, May 31, 2018 at 5:39 AM Peter O'Connor <peter.ed.oconnor@gmail.com> wrote:
You hit the nail on the head. It forces you to unwrap comprehensions in a strange order. This is why I want the "given" to be interspersed with the "for" and "if" and for everything to be in the order you declare it.
Why wouldn't you want to just put the outer given outside the entire comprehension?

    retval = [expr(name, x) given name=update(name, x) for x in seq] given name=something

The more I think about it, the more I want to keep "given" in comprehensions, and given in expressions using parentheses when given is supposed to bind to the expression first.

On Thu, May 31, 2018 at 1:55 PM, Neil Girdhar <mistersheik@gmail.com> wrote:
There seems to be a lot of controversy about updating variables defined outside a comprehension within a comprehension. Seems like it could lead to a lot of bugs and unintended consequences, and that it's safer to not allow side-effects of comprehensions.
I think the problem is that you want given to be used in two ways:

- You want "B given A" to mean "execute A before B" when B is a simple expression
- "given A B" to mean "execute A before B" when B is a loop declaration.

The more I think about it, the more I think comprehensions should be parsed in reverse order: "from right to left". In this alternate world, your initial example would have been:

    potential_updates = {
        (1) y: command.create_potential_update(y)
        (2) if potential_update is not None
        (3) given potential_update = command.create_potential_update(y)
        (4) for y in [x, *x.synthetic_inputs()]
        (5) for x in need_initialization_nodes
    }

Which would translate to:

    potential_updates = {}
    (5) for x in need_initialization_nodes:
    (4)     for y in [x, *x.synthetic_inputs()]:
    (3)         potential_update = command.create_potential_update(y)
    (2)         if potential_update is not None:
    (1)             potential_updates[y] = command.create_potential_update(y)

And there would be no ambiguity about the use of given. Also, variables would tend to be used closer to their declarations.

But it's way too late to make a change like that to Python.

On Thu, May 31, 2018 at 02:22:21PM +0200, Peter O'Connor wrote:
Gosh, you mean that a feature intended to have side-effects might have side-effects? Who would have predicted that! *wink*

I have to keep pointing this out, because critics keep refusing to acknowledge this point: one of the major motivating use-cases of assignment expressions specifically relies on the ability to update a local variable from inside a comprehension. If not for that use-case, we probably wouldn't be having this discussion at all.

I think it ought to take more than "controversy" to eliminate that use-case from consideration. It ought to take a good demonstration that this is a real, not just theoretical, problem, that it *encourages* bugs not just allows them.

*Any* use of variables can "lead to a lot of bugs and unintended consequences" (just ask functional programmers). Yes, it can, but it usually doesn't.

Bottom line is, if you think it is okay that the following assignment to x affects the local scope:

    results = []
    for a in seq:
        # using "given" to avoid arguments about :=
        y = (x given x = a)+1
        results.append(y)
    assert "x" in locals()

but then worry that changing the loop to a comprehension:

    results = [(x given x = a)+1 for a in seq]
    assert "x" in locals()

will be a problem, then I think you are applying an unreasonably strict standard of functional purity towards comprehensions, one which is not justified by Python's consenting adults approach to side-effects or the fact that comprehensions can already have side-effects.

(Sorry Nick!)

[Aside: yes, I realise the assertions will fail if seq is empty. It's just an illustration, not production code.]

-- Steve
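The behaviour argued for here can be observed with the := spelling as it exists in Python 3.8 and later; "given" itself is not valid Python, so this sketch substitutes := for it. The function and variable names are illustrative only.

```python
def leak_demo(seq):
    # The target of := is bound in the function that contains the
    # comprehension, not in the comprehension's private scope.
    results = [(last := item) + 1 for item in seq]

    try:
        item  # the loop variable stays private to the comprehension
        loop_variable_visible = True
    except NameError:
        loop_variable_visible = False

    # `last` holds the final value assigned inside the comprehension.
    return results, last, loop_variable_visible
```

As the aside in the message notes, an empty seq would leave the assigned name unbound, so the sketch is only meaningful for non-empty input.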

On 2018-05-31 05:53, Steven D'Aprano wrote:
What I don't understand is this: if we believe that, then why was comprehension-leaking EVER removed? Everything that I've seen advocating for this kind of leaking seems to me like it is much more logically consistent with allowing all comprehension variables to leak than it is with the current behavior, in which they don't leak. -- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown

On Thu, May 31, 2018 at 11:03:56AM -0700, Brendan Barnwell wrote:
Originally list comprehensions ran in the same scope as their surroundings. In Python 2:

    py> [1 for x in ("spam", "eggs")]
    [1, 1]
    py> x
    'eggs'

but when generator expressions were introduced a few years later, they ran in their own sub-local scope. This was, I believe, initially introduced to simplify the common situation that you return a generator from inside a function:

    def factory():
        x = "something big"
        return (x for x in seq)

would require holding onto a closure of the factory locals, including the "something big" object. Potentially forever, if the generator expression is never iterated over.

(I hope someone will correct me if I have misunderstood.)

To avoid that, the x in the generator was put into a distinct scope from the x in factory. Either way, the PEP introducing generator expressions makes it clear that it was a deliberate decision:

    The loop variable (if it is a simple variable or a tuple of
    simple variables) is not exposed to the surrounding function.
    This facilitates the implementation and makes typical use
    cases more reliable.

https://www.python.org/dev/peps/pep-0289/

In the case of loop variables, there is an argument from "practicality beats purity" that they ought to run in their own scope. Loop variables tend to be short, generic names like "i", "x", "obj", all the more likely to clash with names in the surrounding scope, and hard to spot when they do. They're also likely to be used inside loops:

    for x in [1, 2, 3, 4]:
        alist = [expr for x in range(50)]
        # oops, x has been accidentally overridden (in Python 2)

I don't *entirely* buy that argument, and I occasionally find it useful to inspect the loop variable of a list comprehension after it has run, but this is one windmill I'm not going to tilt against. Reversing the decision to put the loop variables in their own sublocal scope is *not* part of this PEP.

But this is less likely to be a problem for explicit assignment. Outside of toy examples, we're more likely to assign to descriptive names and less likely to clash with any surrounding loop variable:

    for book in library:
        text = [content for chapter in book.chapters()
                if (content := chapter.get_text(all=True))
                and re.match(pattern, content)]

Given an obvious and explicit assignment to "chapter", say, we are more likely to realise when we are reusing a name (compared to assigning to a loop variable i, say). At least, we should be no more likely to mess this up than we are for any other local-level assignment.

-- Steve
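A runnable variant of the library example, with the assignment expression fully parenthesised so that `content` is bound to the chapter text itself. `Chapter` and `Book` are hypothetical classes invented for this sketch (the thread does not define them), and Python 3.8+ is assumed.

```python
import re


class Chapter:
    # Hypothetical: a chapter that just stores its text.
    def __init__(self, text):
        self._text = text

    def get_text(self, all=True):
        return self._text


class Book:
    # Hypothetical: a book is a list of chapters.
    def __init__(self, chapters):
        self._chapters = list(chapters)

    def chapters(self):
        return self._chapters


def matching_texts(library, pattern):
    found = []
    for book in library:
        # get_text() is called once per chapter; empty text is skipped
        # before re.match ever sees it.
        text = [
            content
            for chapter in book.chapters()
            if (content := chapter.get_text(all=True))
            and re.match(pattern, content)
        ]
        found.extend(text)
    return found
```

Without the inner parentheses, := would capture the result of the whole `and` expression instead of the text, because := binds less tightly than `and`.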

On Thu, May 31, 2018 at 04:44:18AM -0400, Neil Girdhar wrote:
Why? So far you haven't given (heh, pun intended) any examples of something you can do better with "given for comprehensions" which isn't either already doable or will be doable with assignment expressions (regardless of spelling).

Earlier, I wrote: "To showcase assignment expressions, we should be solving problems that don't have a good solution now."

(I exclude "re-write your code as a for-loop statement" -- I consider that a last resort, not the best solution.)

Now I realise that good solutions are in the eye of the beholder, but I think we (mostly) agree that:

    [process(x, 2*x, x**3) for obj in seq for x in [func(obj)]]

is a hacky solution for assignments in an expression. It works, but it hardly speaks to the programmer's intention. Whichever syntax we use, an explicit assignment expression is better:

    # verbose, Repeat Yourself syntax
    [process(x given x = func(obj), 2*x, x**3) for obj in seq]

    # concise, Don't Repeat Yourself syntax
    [process(x := func(obj), 2*x, x**3) for obj in seq]

(I don't apologise for the editorial comments.)

Do you have an equally compelling example for your given-comprehension syntax? I didn't think your example was obviously better than what we can already do:

    # calculate tx only once per x loop
    [process(tx, y) for x in xs given tx = transform(x) for y in ys]

    # existing solution
    [process(tx, y) for tx in (transform(x) for x in xs) for y in yz]

Regardless of whether it is spelled := or given, I don't think that this example is a compelling use-case for assignment expressions. I think there are much better use-cases.

(E.g. avoiding cascades of nested if statements.)

-- Steve
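The := form relies on call arguments being evaluated left to right, so that x is bound before 2*x and x**3 are computed. A small sketch, assuming Python 3.8+, with `func` and `process` as placeholder implementations:

```python
def func(obj):
    # Placeholder for the expensive call whose result is reused.
    return obj + 1


def process(a, b, c):
    # Placeholder: just hand the arguments back for inspection.
    return (a, b, c)


def with_for_hack(seq):
    # Single-element list used purely to bind x.
    return [process(x, 2 * x, x ** 3) for obj in seq for x in [func(obj)]]


def with_walrus(seq):
    # x is bound by the first argument, then reused by the next two.
    return [process(x := func(obj), 2 * x, x ** 3) for obj in seq]
```

Swapping the argument order so that a use of x comes before the assignment would break the := version, which is one reason the binding site matters to the reader.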

On Thu, May 31, 2018 at 10:24 PM, Steven D'Aprano <steve@pearwood.info> wrote:
    # alternate existing solution
    [process(tx, y) for x in xs for tx in [transform(x)] for y in yz]

This syntax allows you to use both x and tx in the resultant expression. For instance:

    [process(value, row_total) for row in dataset for row_total in [sum(row)] for value in row]

    ret = []
    for row in dataset:
        row_total = sum(row)
        for value in row:
            ret.append(process(value, row_total))

If done without optimization, each row would take O(n²) time, but this way it's O(n).

I think Serhiy was trying to establish this form as a standard idiom, with optimization in the interpreter to avoid constructing a list and iterating over it (so it would be functionally identical to actual assignment). I'd rather see that happen than the creation of a messy 'given' syntax.

ChrisA
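The row_total idiom runs unchanged on current Python. In this sketch `process` is a stand-in that computes each value's share of its row total; the thread does not say what the real function does.

```python
def process(value, row_total):
    # Stand-in: each value's share of its row total.
    return value / row_total


def with_idiom(dataset):
    # sum(row) is computed once per row via the one-element list.
    return [
        process(value, row_total)
        for row in dataset
        for row_total in [sum(row)]
        for value in row
    ]


def with_loop(dataset):
    ret = []
    for row in dataset:
        row_total = sum(row)
        for value in row:
            ret.append(process(value, row_total))
    return ret


def naive(dataset):
    # Recomputes sum(row) for every value: quadratic in the row length.
    return [process(value, sum(row)) for row in dataset for value in row]
```

All three return the same list; only the naive version repeats the summation for every element of a row.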

On Thu, May 31, 2018 at 2:55 PM, Chris Angelico <rosuav@gmail.com> wrote:
[process(tx, y) for x in xs for tx in [transform(x)] for y in yz]
... I think Serhiy was trying to establish this form as a standard idiom,
Perhaps it wouldn't be crazy to have "with name=initial" be that idiom instead of "for name in [initial]". As ..

    [process(tx, y) for x in xs with tx=transform(x) for y in yz]

.. seems to convey the intention more clearly. More generally (outside of just comprehensions), "with name=expr:" could be used to temporarily bind "name" to "expr" inside the scope of the with-statement (and unbind it at the end).

And then I could have my precious initialized generators (which I believe cannot be nicely implemented with ":=" unless we initialize the variable outside of the scope of the comprehension, which introduces the problem of unintended side-effects).

    smooth_signal = [average with average=0 for x in seq with average=(1-decay)*average + decay*x]
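For reference, the smoothing example can be written today in at least three ways. This is a sketch, not a claim about which is preferable: the := version assumes Python 3.8+ and needs the outer initialisation that the message objects to, and the `itertools.accumulate` version relies on its `initial` keyword, also added in 3.8.

```python
from itertools import accumulate, islice


def smooth_loop(seq, decay):
    average = 0
    out = []
    for x in seq:
        average = (1 - decay) * average + decay * x
        out.append(average)
    return out


def smooth_walrus(seq, decay):
    average = 0  # must be initialised outside the comprehension
    return [(average := (1 - decay) * average + decay * x) for x in seq]


def smooth_accumulate(seq, decay):
    # accumulate yields the initial value first, so skip it with islice.
    def step(average, x):
        return (1 - decay) * average + decay * x

    return list(islice(accumulate(seq, step, initial=0), 1, None))
```

The accumulate form keeps the running state out of the enclosing scope entirely, at the cost of moving the update rule into a helper function.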

On Thu, May 31, 2018 at 11:23 PM, Peter O'Connor <peter.ed.oconnor@gmail.com> wrote:
Except that 'with' means context managers, not just assignment. Also, it's not backward-compatible; if the "for var in [val]" syntax becomes an accepted idiom, it'll be valid in all versions of Python back to, what, 2.4? and just won't be optimized in older versions. Making a new syntax misses out on that benefit, so it needs to be a really good syntax, and 'with' isn't.
I want to read this as "average with average=0" blah blah, which doesn't make a lot of sense. It'd be far FAR better to mess with things so that external assignment works. ChrisA

On Wed, May 30, 2018 at 7:54 PM Steven D'Aprano <steve@pearwood.info> wrote:
In case you missed my earlier reply to you:

One addition to the grammar would be to "test" for something like

    test: bool_test [comp_given]
    bool_test: or_test ['if' or_test 'else' test] | lambdef
    comp_given: 'given' testlist_star_expr annassign

The second would permit the usage in comprehensions:

    comp_iter: comp_for | comp_if | comp_given
Those call transform for every a needlessly.

What's wrong with making this two lines?

    In [1]: import random

    In [2]: xs = [10, 20, 30]

    In [3]: def foo(x):
       ...:     return [x + i for i in range(3)]
       ...:

    In [4]: def bar(y):
       ...:     if random.random() < 0.3:
       ...:         return None
       ...:     return str(y)
       ...:

    In [5]: ys = ((y, bar(y)) for x in xs for y in foo(x))

    In [6]: {y: result for y, result in ys if result is not None}
    Out[6]: {10: '10', 11: '11', 20: '20', 21: '21', 22: '22', 30: '30', 32: '32'}

On Thu, May 31, 2018 at 3:59 AM, Neil Girdhar <mistersheik@gmail.com> wrote:
True. However, it took me several readings to understand what you were doing here. I think I actually prefer "for transformed_b in [transform(b)]" to this syntax, which is saying something. ChrisA

Peter wrote:
Well you could just do:
z = {a: b for b in (transform(bi) for bi in bs) for a in as_} That works, but I prefer the implicit nesting of a sequence of "comp_for" expressions to a the nested generator. On Wed, May 30, 2018 at 2:16 PM Chris Angelico <rosuav@gmail.com> wrote:
I feel you. I think of "given" as an assignment that is in front of the expression, just like "for" (in comp_for) is a for loop that is in front, and "if" (in comp_if) is a condition that is in front.

On Thu, May 31, 2018 at 04:06:51AM +1000, Chris Angelico wrote:
Possibly you shouldn't have tried reading at 4am. Either that or I shouldn't be reading before I've had a coffee :-) Have I missed something that you have seen? Even if the syntax were legal, that seems to be a pointless use of an assignment expression. Since the new name "transformed_b" is only used once, we can and should just use the transform(b) in place: z = {a: transform(b) for b in bs for a in as_} If we need to use it twice, we can do this: # assume "@" stands in for something useful z = {a: (transformed_b := transform(b)) @ transformed_b for b in bs for a in as_} I'm not seeing the advantage of given, or any extra flexibility here, unless the aim is to encourage people to make syntax errors :-) What have I missed? -- Steve

On Thu, May 31, 2018 at 10:05 AM, Steven D'Aprano <steve@pearwood.info> wrote:
Yep, as mentioned in the other post. The fact that you talk like this about it - asserting that it's obvious what this does, while still considering it to be utterly useless - is proof, IMO, that this should be frowned upon in style guides. ChrisA

On Wed, May 30, 2018 at 8:10 PM Steven D'Aprano <steve@pearwood.info> wrote:
Chris just explained it to you. You're calling transform too often.
What have I missed?
Like you say, := and given both work for expressions. "given" could theoretically also be used in comprehensions.
The flexibility of "given" is in giving names to elements of expressions and comprehensions to avoid recalculation.

On Wed, May 30, 2018 at 01:59:37PM -0400, Neil Girdhar wrote:
Is that even legal? Again, you're putting half of the comprehension in the middle of the given expression. I believe that "given" expression syntax is: expression given name = another_expression it's not a syntactic form that we can split across arbitrary chunks of code: # surely this won't be legal? def method(self, arg, x=spam): body given spam = expression Comprehension syntax in this case is: {key:expr for b in it1 for a in it2} (of course comprehensions can also include more loops and if clauses, but this example doesn't use those). So you've interleaved part of the given expression and part of the comprehension: {key: expression COMPRE- given name = another_expression -HENSION} That's the second time you've done that. Neil, if my analysis is correct, I think you have done us a great service: showing that the "given" expression syntax really encourages people to generate syntax errors in their comprehensions.
There is no nice, equivalent := version as far as I can tell.
Given (pun intended) the fact that you only use transformed_b in a single place, I don't think it is necessary to use := at all. z = {a: transform(b) for b in bs for a in as_} But if you really insist: # Pointless use of := z = {a: (transformed_b := transform(b)) for b in bs for a in as_} -- Steve

On Thu, May 31, 2018 at 9:53 AM, Steven D'Aprano <steve@pearwood.info> wrote:
That's the subtlety of the 'given' usage here. You fell for the same trap I did: thinking "it's only used once". Actually, what he has is equivalent to: z = {a: tb for b in bs for tb in [transform(b)] for a in as_} which means it evaluates transform(b) once regardless of the length of as_. But it's really REALLY not obvious. That's why I actually prefer the "interpolated 'for' loop" notation, despite it being distinctly distasteful in general. At least it's obvious that something weird is happening, so you don't instantly assume that you can inline the single usage. ChrisA
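The difference is easy to check by counting calls. This is a minimal sketch; `transform` here is a hypothetical stand-in that only counts how often it runs.

```python
calls = 0


def transform(b):
    # Hypothetical stand-in that records each call.
    global calls
    calls += 1
    return b * 10


bs = [1, 2, 3]
as_ = ["p", "q", "r", "s"]

# Inlined: transform(b) runs once per (b, a) pair.
calls = 0
z_inline = {a: transform(b) for b in bs for a in as_}
inline_calls = calls

# Interpolated single-element loop: transform(b) runs once per b.
calls = 0
z_once = {a: tb for b in bs for tb in [transform(b)] for a in as_}
once_calls = calls

assert z_inline == z_once
assert (inline_calls, once_calls) == (12, 3)
```

Both spellings build the same dict; only the number of calls differs, which is why inlining the "single use" is not a safe refactoring here.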

On Thu, May 31, 2018 at 10:05:33AM +1000, Chris Angelico wrote:
But it is only used once. I meant once per loop. It isn't used in the "for a in as_" inner loop, there's no "if transformed_b" condition, and it only is used once in the key:value part of the comprehension.
Actually, what he has is equivalent to:
z = {a: tb for b in bs for tb in [transform(b)] for a in as_}
Which also uses tb only once, making it a Useless Use Of Assignment. (I assume we're not calling transform() for some side-effect, like logging a message, or erasing your hard drive.)
which means it evaluates transform(b) once regardless of the length of as_.
Ah yes, I see what you mean. Expanded to a loop:

    for b in bs:
        tb = transform(b)
        for a in as_:
            z[a] = tb

It's a little ugly, but there's a trick I already use today:

    py> [x+y for x in "abc" if print(x) or True for y in "de"]
    a
    b
    c
    ['ad', 'ae', 'bd', 'be', 'cd', 'ce']

So we can adapt that to assignment instead of output:

    # Don't do this!
    z = {a: tb for b in bs if (tb := transform(b)) or True for a in as_}

But I wouldn't do that. If I'm concerned about the call to transform (because it is super expensive, say) then I set up a pipeline:

    tbs = (transform(b) for b in bs)  # or map(transform, bs)
    z = {a: tb for tb in tbs for a in as_}

The first generator comprehension can be easily embedded in the other:

    z = {a: tb for tb in (transform(b) for b in bs) for a in as_}

This makes it super-obvious that transform is called for each b, not for each (b, a) pair, it works today, and there's no assignment expression needed at all. Assignment expressions should not be about adding yet a third way to solve a problem that already has a perfectly good solution! ("Expand to a loop statement" is not a *perfectly* good solution.) To showcase assignment expressions, we should be solving problems that don't have a good solution now. I'm still not convinced that Neil's "given" example will even work (see below) but *if he is right* that it does, perhaps that's a good reason to prefer the simpler := assignment expression syntax, since we're less likely to use it in confusing ways.
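As a quick check that the three spellings agree, here is a small sketch (Python 3.8+ for the := line). The `transform` function and the sample data are hypothetical; `transform` deliberately returns a falsy value once, to show why the trick needs "or True".

```python
def transform(b):
    # Hypothetical transform; returns 0 (falsy) for b == 1.
    return b - 1


bs = [1, 2, 3]
as_ = ["p", "q"]

# Reference: the expanded loop.
z_loop = {}
for b in bs:
    tb = transform(b)
    for a in as_:
        z_loop[a] = tb

# Nested generator expression: no assignment expression needed.
z_pipeline = {a: tb for tb in (transform(b) for b in bs) for a in as_}

# The "or True" trick with :=; without "or True" the falsy
# result 0 would be filtered out of the comprehension.
z_trick = {a: tb for b in bs if (tb := transform(b)) or True for a in as_}

assert z_loop == z_pipeline == z_trick
```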
But it's really REALLY not obvious.
But is it even legal? As I understand it, "given" is an expression, not an addition to comprehension syntax. In that case, I don't think Neil's example will work at all, for reasons I've already stated. If that's not the case, then until somebody tells me what this new comprehension syntax means, and what it looks like, I have no idea what is intended. Which of these can we write, and what do they do?

    [expression given name=something for x in seq]
    [expression for x given name=something in seq]
    [expression for x in seq given name=something]
    [expression for x in seq if given name=something condition]
    [expression for x in seq if condition given name=something]

-- Steve

On Wed, May 30, 2018 at 9:02 PM Steven D'Aprano <steve@pearwood.info> wrote:
Great question. The trick is to just write them as a sequence of statements without changing the order except to put the expression last.
[expression given name=something for x in seq]

    retval = []
    name = something
    for x in seq:
        retval.append(expression)
    return retval

[expression for x given name=something in seq]

this one doesn't make sense.

[expression for x in seq given name=something]

    retval = []
    for x in seq:
        name = something
        retval.append(expression)
    return retval

[expression for x in seq if given name=something condition]

this one doesn't make sense.

[expression for x in seq if condition given name=something]

    retval = []
    for x in seq:
        if condition:
            name = something
            retval.append(expression)
    return retval

and of course, the original proposal

    expression given name=something

means:

    name = something
    retval = expression
    return retval
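Since "given" is only a proposal, the translations above cannot be run directly. As a rough sketch, the two per-iteration forms can be compared with the closest spelling that works today, a single-element "for" clause. The concrete expressions (`name + x`, `x * x`, the even-number condition) are placeholders chosen for illustration.

```python
seq = [1, 2, 3, 4]

# Proposed: [name + x for x in seq given name = x * x]
# Statement form, following the translation rule described above.
retval = []
for x in seq:
    name = x * x
    retval.append(name + x)

# Closest spelling that runs today: a single-element "for" clause.
today = [name + x for x in seq for name in [x * x]]
assert retval == today

# Proposed: [name + x for x in seq if x % 2 == 0 given name = x * x]
retval_filtered = []
for x in seq:
    if x % 2 == 0:
        name = x * x
        retval_filtered.append(name + x)

today_filtered = [name + x for x in seq if x % 2 == 0 for name in [x * x]]
assert retval_filtered == today_filtered
```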

Okay, I thought about it some more, and I think I'm mistaken about the possibility of adding both rules to the grammar, since in that case it is ambiguous whether given binds more tightly to a trailing expression or to the comp_iter. It's too bad.
On Wed, May 30, 2018 at 10:50 PM Neil Girdhar <mistersheik@gmail.com> wrote:

On Thu, May 31, 2018 at 4:50 AM, Neil Girdhar <mistersheik@gmail.com> wrote:
That's a little strange confusing then, because, given the way given is used outside of comprehensions, you would expect for x in range(3): y given y=2*x [y given y=2*x for x in range(3)] to return [0, 2, 4], but it would actually raise an error.

* Sorry, message sent too early: On Thu, May 31, 2018 at 4:50 AM, Neil Girdhar <mistersheik@gmail.com> wrote:
That's a little confusing then, because, given the way given is used outside of comprehensions, you would expect [y given y=2*x for x in range(3)] to return [0, 2, 4], but it would actually raise an error.
On Thu, May 31, 2018 at 10:32 AM, Peter O'Connor <peter.ed.oconnor@gmail.com> wrote:

Yes, you're right. That's the ambiguity I mentioned in my last message. It's too bad because I want given for expressions and given for comprehensions. But if you have both, there's ambiguity and you would at least need parentheses: [(y given y=2*x) for x in range(3)] That might be fine. On Thu, May 31, 2018 at 4:34 AM Peter O'Connor <peter.ed.oconnor@gmail.com> wrote:

Well, there need not be any ambiguity if you think of "B given A" as "execute A before B", and remember that "given" has a lower precedence than "for" (so [B given A for x in seq] is parsed as [(B given A) for x in seq]). Then

    retval = [expr(name) given name=something(x) for x in seq]

Is:

    retval = []
    for x in seq:
        name = something(x)
        retval.append(expr(name))

And

    retval = [expr(name, x) for x in seq given name=something]

Is:

    retval = []
    name = something
    for x in seq:
        retval.append(expr(name, x))

But this is probably not a great solution, as it forces you to mentally unwrap comprehensions in a strange order and remember a non-obvious precedence rule. On the plus side, it lets you initialize generators with in-loop updates (which cannot, as far as I can see, be done nicely with ":="):

    retval = [expr(name, x) given name=update(name, x) for x in seq given name=something]

Is:

    retval = []
    name = something
    for x in seq:
        name = update(name, x)
        retval.append(expr(name, x))

On Thu, May 31, 2018 at 10:44 AM, Neil Girdhar <mistersheik@gmail.com> wrote:
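For comparison, here is a sketch of the "initialize, then update in the loop" pattern in Python that runs today (3.8+). The initialization has to sit outside the comprehension, which is the drawback being discussed. `update`, `expr`, and the data are hypothetical placeholders.

```python
def update(name, x):
    # Hypothetical in-loop update: a running sum.
    return name + x


def expr(name, x):
    # Hypothetical result expression: pair each item with the running state.
    return (x, name)


def with_loop(seq, something):
    retval = []
    name = something
    for x in seq:
        name = update(name, x)
        retval.append(expr(name, x))
    return retval


def with_walrus(seq, something):
    # The initial binding lives outside the comprehension;
    # := then rebinds the same function-local name on each iteration.
    name = something
    return [expr(name := update(name, x), x) for x in seq]


assert with_loop([1, 2, 3], 0) == with_walrus([1, 2, 3], 0)
```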

On Thu, May 31, 2018 at 5:39 AM Peter O'Connor <peter.ed.oconnor@gmail.com> wrote:
You hit the nail on the head. It forces you to unwrap comprehensions in a strange order. This is why I want the "given" to be interspersed with the "for" and "if" and for everything to be in the order you declare it.
Why wouldn't you want to just put the outer given outside the entire comprehension? retval = [expr(name, x) given name=update(name, x) for x in seq] given name=something The more I think about it, the more I want to keep "given" in comprehensions, and given in expressions using parentheses when given is supposed to bind to the expression first.

On Thu, May 31, 2018 at 1:55 PM, Neil Girdhar <mistersheik@gmail.com> wrote:
There seems to be a lot of controversy about updating variables defined outside a comprehension within a comprehension. Seems like it could lead to a lot of bugs and unintended consequences, and that it's safer to not allow side-effects of comprehensions.
I think the problem is that you want given to be used in two ways:
- You want "B given A" to mean "execute A before B" when B is a simple expression
- "given A B" to mean "execute A before B" when B is a loop declaration.

The more I think about it, the more I think comprehensions should be parsed in reverse order: "from right to left". In this alternate world, your initial example would have been:

    potential_updates = {
        (1) y: command.create_potential_update(y)
        (2) if potential_update is not None
        (3) given potential_update = command.create_potential_update(y)
        (4) for y in [x, *x.synthetic_inputs()]
        (5) for x in need_initialization_nodes
    }

Which would translate to:

    potential_updates = {}
    (5) for x in need_initialization_nodes:
    (4)     for y in [x, *x.synthetic_inputs()]:
    (3)         potential_update = command.create_potential_update(y)
    (2)         if potential_update is not None:
    (1)             potential_updates[y] = command.create_potential_update(y)

And there would be no ambiguity about the use of given. Also, variables would tend to be used closer to their declarations. But it's way too late to make a change like that to Python.

On Thu, May 31, 2018 at 02:22:21PM +0200, Peter O'Connor wrote:
Gosh, you mean that a feature intended to have side-effects might have side-effects? Who would have predicted that! *wink* I have to keep pointing this out, because critics keep refusing to acknowledge this point: one of the major motivating use-cases of assignment expressions specifically relies on the ability to update a local variable from inside a comprehension. If not for that use-case, we probably wouldn't be having this discussion at all. I think it ought to take more than "controversy" to eliminate that use-case from consideration. It ought to take a good demonstration that this is a real, not just theoretical, problem, that it *encourages* bugs not just allows them. *Any* use of variables can "lead to a lot of bugs and unintended consequences" (just ask functional programmers). Yes, it can, but it usually doesn't. Bottom line is, if you think it is okay that the following assignment to x affects the local scope:

    results = []
    for a in seq:
        # using "given" to avoid arguments about :=
        y = (x given x = a)+1
        results.append(y)
    assert "x" in locals()

but then worry that changing the loop to a comprehension:

    results = [(x given x = a)+1 for a in seq]
    assert "x" in locals()

will be a problem, then I think you are applying an unreasonably strict standard of functional purity towards comprehensions, one which is not justified by Python's consenting adults approach to side-effects or the fact that comprehensions can already have side-effects. (Sorry Nick!) [Aside: yes, I realise the assertions will fail if seq is empty. It's just an illustration, not production code.] -- Steve
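With the := spelling, which was later accepted in PEP 572 and shipped in Python 3.8, the comprehension case can be tried directly. A minimal sketch, with `demo` as a hypothetical wrapper name; it is written as a function so that "x" is a genuine local:

```python
def demo(seq):
    # The target of := inside a comprehension is bound in the
    # containing function's scope, so x is still visible afterwards.
    results = [(x := a) + 1 for a in seq]
    return results, x, "x" in locals()


results, last_x, x_is_local = demo([1, 2, 3])
assert results == [2, 3, 4]
assert last_x == 3
assert x_is_local
```

As the aside above notes, the same code fails with an unbound local if `seq` is empty, because the assignment never runs.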

On 2018-05-31 05:53, Steven D'Aprano wrote:
What I don't understand is this: if we believe that, then why was comprehension-leaking EVER removed? Everything that I've seen advocating for this kind of leaking seems to me like it is much more logically consistent with allowing all comprehension variables to leak than it is with the current behavior, in which they don't leak. -- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown

On Thu, May 31, 2018 at 11:03:56AM -0700, Brendan Barnwell wrote:
Originally list comprehensions ran in the same scope as their surroundings. In Python 2:

    py> [1 for x in ("spam", "eggs")]
    [1, 1]
    py> x
    'eggs'

but when generator expressions were introduced a few years later, they ran in their own sub-local scope. This was, I believe, initially introduced to simplify the common situation that you return a generator from inside a function:

    def factory():
        x = "something big"
        return (x for x in seq)

would require holding onto a closure of the factory locals, including the "something big" object. Potentially forever, if the generator expression is never iterated over. (I hope someone will correct me if I have misunderstood.) To avoid that, the x in the generator was put into a distinct scope from the x in factory. Either way, the PEP introducing generator expressions makes it clear that it was a deliberate decision:

    The loop variable (if it is a simple variable or a tuple of simple variables) is not exposed to the surrounding function. This facilitates the implementation and makes typical use cases more reliable.

https://www.python.org/dev/peps/pep-0289/

In the case of loop variables, there is an argument from "practicality beats purity" that they ought to run in their own scope. Loop variables tend to be short, generic names like "i", "x", "obj", all the more likely to clash with names in the surrounding scope, and hard to spot when they do. They're also likely to be used inside loops:

    for x in [1, 2, 3, 4]:
        alist = [expr for x in range(50)]
        # oops, x has been accidentally overridden (in Python 2)

I don't *entirely* buy that argument, and I occasionally find it useful to inspect the loop variable of a list comprehension after it has run, but this is one windmill I'm not going to tilt against. Reversing the decision to put the loop variables in their own sublocal scope is *not* part of this PEP. But this is less likely to be a problem for explicit assignment.
Outside of toy examples, we're more likely to assign to descriptive names and less likely to clash with any surrounding loop variable: for book in library: text = [content for chapter in book.chapters() if (content := chapter.get_text(all=True)) and re.match(pattern, content)] Given an obvious and explicit assignment to "chapter", say, we are more likely to realise when we are reusing a name (compared to assigning to a loop variable i, say). At least, we should be no more likely to mess this up than we are for any other local-level assignment. -- Steve
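The two scoping behaviours can be seen side by side in Python 3.8+. This is only an illustrative sketch with made-up data: the comprehension's own loop variable stays private, while an explicit := assignment rebinds the name in the surrounding scope.

```python
# Loop variable of a comprehension: private to the comprehension in Python 3.
kept = []
for x in [1, 2, 3, 4]:
    alist = [x * 0 for x in range(50)]  # inner x does not touch the outer x
    kept.append(x)

assert kept == [1, 2, 3, 4]

# Explicit assignment with := does rebind the outer name.
rebound = []
for x in [1, 2]:
    _ = [(x := i) for i in range(5)]  # x is now the last i, i.e. 4
    rebound.append(x)

assert rebound == [4, 4]
```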

On Thu, May 31, 2018 at 04:44:18AM -0400, Neil Girdhar wrote:
Why? So far you haven't given (heh, pun intended) any examples of something you can do better with "given for comprehensions" which isn't either already doable or will be doable with assignment expressions (regardless of spelling). Earlier, I wrote: "To showcase assignment expressions, we should be solving problems that don't have a good solution now." (I exclude "re-write your code as a for-loop statement" -- I consider that a last resort, not the best solution.) Now I realise that good solutions are in the eye of the beholder, but I think we (mostly) agree that:

    [process(x, 2*x, x**3) for obj in seq for x in [func(obj)]]

is a hacky solution for assignments in an expression. It works, but it hardly speaks to the programmer's intention. Whichever syntax we use, an explicit assignment expression is better:

    # verbose, Repeat Yourself syntax
    [process(x given x = func(obj), 2*x, x**3) for obj in seq]

    # concise, Don't Repeat Yourself syntax
    [process(x := func(obj), 2*x, x**3) for obj in seq]

(I don't apologise for the editorial comments.) Do you have an equally compelling example for your given-comprehension syntax? I didn't think your example was obviously better than what we can already do:

    # calculate tx only once per x loop
    [process(tx, y) for x in xs given tx = transform(x) for y in ys]

    # existing solution
    [process(tx, y) for tx in (transform(x) for x in xs) for y in ys]

Regardless of whether it is spelled := or given, I don't think that this example is a compelling use-case for assignment expressions. I think there are much better use-cases. (E.g. avoiding cascades of nested if statements.) -- Steve
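The := form and the single-element-loop hack can be compared directly on Python 3.8+. In this sketch `func`, `process`, and `seq` are hypothetical placeholders; the := version relies on call arguments being evaluated left to right, so x is bound before 2*x and x**3 are computed.

```python
def func(obj):
    # Hypothetical expensive computation.
    return len(obj)


def process(a, b, c):
    # Hypothetical consumer of the three derived values.
    return (a, b, c)


seq = ["a", "bb", "ccc"]

# Hack that works in older versions: loop over a one-element list.
hack = [process(x, 2*x, x**3) for obj in seq for x in [func(obj)]]

# Assignment expression: bind x in the first argument, reuse it after.
walrus = [process(x := func(obj), 2*x, x**3) for obj in seq]

assert hack == walrus
```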

On Thu, May 31, 2018 at 10:24 PM, Steven D'Aprano <steve@pearwood.info> wrote:

    # alternate existing solution
    [process(tx, y) for x in xs for tx in [transform(x)] for y in ys]

This syntax allows you to use both x and tx in the resultant expression. For instance:

    [process(value, row_total) for row in dataset for row_total in [sum(row)] for value in row]

which is equivalent to:

    ret = []
    for row in dataset:
        row_total = sum(row)
        for value in row:
            ret.append(process(value, row_total))

If done without optimization, each row would take O(n²) time, but this way it's O(n). I think Serhiy was trying to establish this form as a standard idiom, with optimization in the interpreter to avoid constructing a list and iterating over it (so it would be functionally identical to actual assignment). I'd rather see that happen than the creation of a messy 'given' syntax. ChrisA
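A small runnable version of the row-total example, with a hypothetical `process` that computes each value's share of its row. The naive form calls sum(row) once per value; the idiom calls it once per row.

```python
dataset = [[1, 2, 3], [10, 20]]


def process(value, row_total):
    # Hypothetical: each value's share of its row total.
    return value / row_total


# sum(row) is evaluated once per row.
shares = [
    process(value, row_total)
    for row in dataset
    for row_total in [sum(row)]
    for value in row
]

# Naive version: sum(row) is re-evaluated for every value in the row.
naive = [process(value, sum(row)) for row in dataset for value in row]

assert shares == naive
```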

On Thu, May 31, 2018 at 2:55 PM, Chris Angelico <rosuav@gmail.com> wrote:
[process(tx, y) for x in xs for tx in [transform(x)] for y in ys]
... I think Serhiy was trying to establish this form as a standard idiom,
Perhaps it wouldn't be crazy to have "with name=initial" be that idiom instead of "for name in [initial]". As ..

    [process(tx, y) for x in xs with tx=transform(x) for y in ys]

.. seems to convey the intention more clearly. More generally (outside of just comprehensions), "with name=expr:" could be used to temporarily bind "name" to "expr" inside the scope of the with-statement (and unbind it at the end). And then I could have my precious initialized generators (which I believe cannot be nicely implemented with ":=" unless we initialize the variable outside of the scope of the comprehension, which introduces the problem of unintended side-effects).

    smooth_signal = [average with average=0 for x in seq with average=(1-decay)*average + decay*x]
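Reading the smooth_signal line as an exponential moving average (an interpretation of the proposed syntax, since it cannot be run), the same result is available today without touching an outer variable. The sketch below assumes Python 3.8+ for the `initial` argument of `itertools.accumulate`; the sample values and `decay` are made up.

```python
from itertools import accumulate

decay = 0.5
seq = [4.0, 4.0, 0.0]

# Reference: explicit loop with the state initialized outside it.
smooth_loop = []
average = 0
for x in seq:
    average = (1 - decay) * average + decay * x
    smooth_loop.append(average)

# accumulate carries the running state itself; with initial=0 the first
# yielded item is the initial value, so it is dropped with [1:].
smooth_acc = list(
    accumulate(seq, lambda avg, x: (1 - decay) * avg + decay * x, initial=0)
)[1:]

assert smooth_loop == smooth_acc
```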

On Thu, May 31, 2018 at 11:23 PM, Peter O'Connor <peter.ed.oconnor@gmail.com> wrote:
Except that 'with' means context managers, not just assignment. Also, it's not backward-compatible; if the "for var in [val]" syntax becomes an accepted idiom, it'll be valid in all versions of Python back to, what, 2.4? and just won't be optimized in older versions. Making a new syntax misses out on that benefit, so it needs to be a really good syntax, and 'with' isn't.
I want to read this as "average with average=0" blah blah, which doesn't make a lot of sense. It'd be far FAR better to mess with things so that external assignment works. ChrisA

On Wed, May 30, 2018 at 7:54 PM Steven D'Aprano <steve@pearwood.info> wrote:
In case you missed my earlier reply to you: One addition to the grammar would be to "test" for something like test: bool_test [comp_given] bool_test: or_test ['if' or_test 'else' test] | lambdef comp_given: 'given' testlist_star_expr annassign The second would permit the usage in comprehensions: comp_iter: comp_for | comp_if | comp_given
Those call transform for every a needlessly.

What's wrong with making this two lines?

    In [1]: import random
    In [2]: xs = [10, 20, 30]
    In [3]: def foo(x):
       ...:     return [x + i for i in range(3)]
       ...:
    In [4]: def bar(y):
       ...:     if random.random() < 0.3:
       ...:         return None
       ...:     return str(y)
       ...:
    In [5]: ys = ((y, bar(y)) for x in xs for y in foo(x))
    In [6]: {y: result for y, result in ys if result is not None}
    Out[6]: {10: '10', 11: '11', 20: '20', 21: '21', 22: '22', 30: '30', 32: '32'}

participants (6)
- Brendan Barnwell
- Chris Angelico
- Michael Selik
- Neil Girdhar
- Peter O'Connor
- Steven D'Aprano