Unpacking iterables for augmented assignment

I propose we apply PEP 3132 (extended iterable unpacking) to PEP 203 (augmented assignments). That is, for every statement where "<lhs> = <rhs>" is valid, I propose "<lhs> += <rhs>" should also be valid. Simple example:

a = 0
b = 0
a, b += 1, 2
# a is now 1
# b is now 2
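Under the proposal, that statement would behave roughly like the following present-day Python (a sketch of my reading of the intended expansion, not a specification; _rhs is just an illustrative temporary):

a = 0
b = 0

_rhs = (1, 2)   # evaluate the right-hand side once
a += _rhs[0]    # each target gets its own augmented assignment
b += _rhs[1]

assert (a, b) == (1, 2)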

Hi James

Thank you for the simple example. It makes discussing your proposal much easier. I hope you don't mind, I'll modify it a little to make the semantics clearer, at least to me. Here it is.

init = (10, 20)
incr = (1, 2)
(a, b) = init   # Now, (a, b) == (10, 20)
a, b += incr    # Now, (a, b) == (11, 22)

However, I'm not sure your suggestion sits well with the following
>>> (10, 20) + (1, 2)
(10, 20, 1, 2)
By the way, I was surprised to find that this is valid
[a, b] = 1, 2
with best regards Jonathan

James Lu writes:
This isn't consistent with the normal behavior of Python sequences:

$ python3.6
>>> (1, 2) + (3, 4)
(1, 2, 3, 4)
That is, "+" means something different for sequences. Furthermore, the thing on the LHS is tuple syntax. There's only one reasonable meaning to give to an ordinary assignment from a sequence on the RHS to a "tuple of variables". But in your expression, the thing on the LHS wants to be an object, which can only be a concrete tuple:
which is itself immutable, as are its integer components in your example (which is not a toy, as complex expressions on the RHS would mean you could only fit two, maybe three on a line, which doesn't save much vertical space). You can argue that the same would hold for ordinary assignment, but it doesn't, and that's true. However, ordinary assignment is a name-binding operation, while augmented assignment is a mutation of the underlying object. This is problematic for lists:
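To make that binding-versus-mutation distinction concrete, here is a small illustration of my own (not the example from the original message):

x = [1, 2]
alias = x

x = x + [3]        # ordinary assignment: builds a new list and rebinds the name
print(alias)       # [1, 2] -- the original list is untouched

x = [1, 2]
alias = x
x += [3]           # augmented assignment: list.__iadd__ mutates the list in place
print(alias)       # [1, 2, 3] -- the alias sees the change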
I don't think the list syntax has any use different from the tuple syntax. The example of list syntax makes me uncomfortable, though. I wonder if there are traps with other mutable objects. Finally, when I look at
a, b += 1, 2
as a once and future C programmer, I expect a value of (0, 1, 2) to be printed, with a == 0, and b == 1 at this point. (This is really minor; we like to avoid confusion with carryovers from other languages, but it's not enough to kill something useful.) That said, none of the above is sufficient reason to reject a useful syntax addition (augmented assignment to a tuple is currently an error). But it seems to me that compared to assignment by sequence unrolling, this generalization isn't very expressive. AIUI, one of the motivations for unrolling sequence-to-sequence assignment in this way is to provide the obvious notation for permutations, and generalizing from there:
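For instance (the standard idioms, reconstructed here rather than quoted from the original message):

a, b = 1, 2
a, b = b, a          # the classic swap, no temporary needed
assert (a, b) == (2, 1)

a, b, c = 1, 2, 3
a, b, c = c, a, b    # generalizing: rotate three names
assert (a, b, c) == (3, 1, 2)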
Without the sequence unrolling feature, you can't do any of the above without adding temporary variables. With augmented assignments, you can save some vertical space. Is there something else I'm missing? Steve

Steve and Jonathan, it seems like we're on the same page. Currently, is <expr1> = <expr2> = <expr3> = <expr4> always equivalent to <expr1> = <expr4>; <expr2> = <expr4>; <expr3> = <expr4>? When there is a tuple or list of names on the left hand side (e.g. `a, b` or `(a, b)` or `[a, b]`), unpack the right hand side into values and perform the augmented assignment on each name on the left hand side. Borrowing Jonathan's example:

a = b = 0
incr = (1, 2)
a, b += incr
# now (a, b) == (1, 2)

concat = (-1, 0)
concat += incr
# existing behavior: now concat == (-1, 0, 1, 2)

Like Steve said, you can save vertical space and avoid creating temporary variables:

temp_a, temp_b = simulate(new_deck)
tally_a += temp_a
tally_b += temp_b

could become:

tally_a, tally_b += simulate(new_deck)

Hi James

Thank you for your message. I have some general, background comments. However, the question remains: Does your proposal sit well with the rest of Python? By the way, I've just found a nice article: http://treyhunner.com/2018/03/tuple-unpacking-improves-python-code-readabili... You wrote
Currently, is <expr1> = <expr2> = <expr3> = <expr4> always equivalent to <expr1> = <expr4>; <expr2> = <expr4>; <expr3> = <expr4>?
It is very important, in this context, to know what we can assign to. We can assign to some expressions, but not others. Here are some examples
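Roughly along these lines (the snippets below are my own illustrations; the names are hypothetical):

class Box:
    pass

box = Box()
box.attr = 1        # attribute target: assignable
items = [0, 1]
items[0] = 2        # subscript target: assignable
(x, y) = (1, 2)     # tuple of names: assignable (sequence unpacking)
[x, y] = (3, 4)     # list of names: also assignable

# These, by contrast, are rejected at compile time with "SyntaxError: cannot assign to ...":
#     len(items) = 2
#     (x + y) = 5
#     1 = x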
When assigned to, the expression "(a, b)" has a special meaning. Strictly speaking, we are not assigning to a tuple. Hence
(fn()[1], a.b) = (0, 1) # Valid syntax. NameError: name 'fn' is not defined
Finally, here's something that surprised me a little bit
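The behaviour in question can be seen with id() (a reconstruction of my own; the exact id values will vary from run to run):

lst = [1, 2]
print(id(lst))
lst += [3, 4]
print(id(lst))    # same id: the existing list was extended in place

tup = (1, 2)
print(id(tup))
tup += (3, 4)
print(id(tup))    # different id: a new tuple object was created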
Notice that '+=' re-uses the same object when the object is a list, but creates a new object when it is a tuple. This raises the question: why and how does Python behave in this way? Finally, thank you for your stimulating idea. I think to take it forward you have to deal with two questions:
1. Does the proposal sit well with the rest of Python?
2. Does it IN PRACTICE bring sufficient benefits to users?
Please note that Python deservedly has a reputation for being a readable language. Some will argue that allowing such statements would harm that readability.
Somewhere, as I recall, I read that writing

a = <EXP1>
b = <EXP2>

is preferable for reasons of clarity to

a, b = <EXP1>, <EXP2>

when the two assignments are unrelated. -- Jonathan

On Sun, Aug 26, 2018 at 06:19:36PM +0100, Jonathan Fine wrote:
Lists are mutable and can be modified in place. Tuples are immutable and cannot be.

By the way: it's not reliable to compare ID numbers for objects which don't necessarily exist simultaneously. The Python interpreter is permitted to re-use ID numbers. For example:

py> s = [1, 4]
py> id(s)
3080212620
py> del s
py> s = [-1, 3]
py> id(s)
3080212620

The only reliable use of ID numbers is to compare the ID of objects which are known to still exist. -- Steve

Hi James and Steve

Steve and I wrote:
Lists are mutable and can be modified in place. Tuples are immutable and cannot be.
This correctly answers why this happens. Steve: I wanted James to think about this question. He learns more that way. (I already knew the answer.) James: Combining tuple assignment with increment assignment in a single statement will increase the cognitive burden on both writer and reader of the line of code. In other words, in most cases the combined statement is harder to read than separate augmented assignments.
Now look at:
a, b += my_object.some_method(args)
It is simpler than:
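That is, simpler than spelling it out with a temporary (the names tmp_a and tmp_b are illustrative):

tmp_a, tmp_b = my_object.some_method(args)
a += tmp_a
b += tmp_b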
You can say that here the extra lines increase the cognitive burden. And I think I'd agree with you. But I think there's a code-smell here. Better, I think, is to introduce a type that supports augmented assignment. For example
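A sketch of such a type (this class is my own illustration, not from the original message):

class Tally:
    """A pair of counters that supports element-wise augmented assignment."""

    def __init__(self, a=0, b=0):
        self.a = a
        self.b = b

    def __iadd__(self, other):
        da, db = other          # accepts any two-element iterable
        self.a += da
        self.b += db
        return self             # '+=' rebinds the target to this same object

    def __repr__(self):
        return f"Tally(a={self.a}, b={self.b})"

tally = Tally()
tally += (1, 2)
tally += (1, 2)
print(tally)    # Tally(a=2, b=4)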
I hope this helps -- Jonathan

On Sun, Aug 26, 2018 at 07:50:49PM +0100, Jonathan Fine wrote:
How do you know James doesn't already know the answer too? If you meant this as a lesson for James, you should have said so. You shouldn't have claimed to have been surprised by it: "Finally, here's something that surprised me a little bit"
Each additional line we read has to be read as a separate operation. Moving things into a single operation can reduce the cognitive load, at least sometimes. Reducing the number of lines

x += 1
y += 2
z += 3

down to one:

x, y, z += 1, 2, 3  # Hypothetical syntax

could also *decrease* the cognitive burden on the writer or reader if x, y, z form a logically connected group. Each additional line of code has its own cognitive load just by being an extra line to process. For a small number of simple targets, our brains can chunk the assignments into one conceptual operation: "assign x, y, z" instead of three: "assign x; then assign y; also assign z". Hence a reduced cognitive load on both reader and writer. I see no reason why this chunking can't also apply to augmented assignments: "increment x, y, z". But I do worry that it might only chunk effectively if the increment is the same:

x, y, z += 1  # increment each of x, y, z by 1

but not if we have to specify each increment separately:

x, y, z += 1, 1, 1

I don't have any proof of this concern. [...]
I would like to see Python increase its support for vectorized operations. Some years ago I tried adding vectorized support to the statistics module as an experiment, but I found that the performance hit was horrible, so I abandoned the experiment. YMMV. Julia includes syntax to automatically vectorize any operator or function: https://docs.julialang.org/en/v0.6.2/manual/functions/#man-vectorized-1 but that's moving away from the topic at hand. (Any responses to my vectorization comment, please start a new thread or at least change the subject line.) Back to the original topic... My gut feeling is that this suggested syntax would work well if there was a single value on the right hand side, so that:

spam, eggs, cheese += foo

is approximately equivalent to:

# evaluate RHS only once
_tmp = foo
spam += _tmp
eggs += _tmp
cheese += _tmp

but not so well if we add sequence unpacking on the RHS:

spam, eggs, cheese += foo, bar, baz

Without adding new syntax, I think we can only pick one set of semantics, not both: either a single RHS value, or multiple values. My intuition is that the first version (a single value on the RHS) is not only more useful, but also more readable, since it allows the reader to chunk the operation to "increment these three targets" while the second doesn't. That also leaves the door open in the future to adding a vectorized version of augmented assignment, if and when we add syntax for vectorizing operations a la Julia. So... a tentative +1 to allowing:

spam, eggs += foo

and an even more tentative -0 to:

spam, eggs += foo, bar

-- Steve

James Lu writes:
Currently, is <expr1> = <expr2> = <expr3> = <expr4> always equivalent to <expr1> = <expr4>; <expr2> = <expr4>; <expr3> = <expr4>?
No. It's equivalent to

<expr3> = <expr4>
<expr2> = <expr3>
<expr1> = <expr2>

and the order matters because the <expr>s may have side effects. Not sure where the rest of your message was going; it mostly just seemed to repeat examples from earlier posts? Steve

[James Lu]
Currently, is <expr1> = <expr2> = <expr3> = <expr4> always equivalent to <expr1> = <expr4>; <expr2> = <expr4>; <expr3> = <expr4>?
[Stephen J. Turnbull]
This is tricky stuff. In fact the rightmost expression is evaluated once, and then the bindings are done left-to-right using the result of evaluating the rightmost expression. Like so:
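One way to observe this ordering is with targets that announce when they are assigned to (an illustration of my own, not the original example):

class Loud:
    """A container that reports every assignment made to it."""
    def __init__(self, name):
        self.name = name
    def __setitem__(self, key, value):
        print(f"assigning {value!r} to {self.name}[{key}]")

x = Loud("x")
y = Loud("y")
z = Loud("z")
x[0] = y[0] = z[0] = "spam"
# Output:
#   assigning 'spam' to x[0]
#   assigning 'spam' to y[0]
#   assigning 'spam' to z[0]
# The right-hand side is evaluated once; the targets are then assigned left to right.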
So James's account is closer to what's done, but is missing weasel words to make clear that <expr4> is evaluated only once. Sane code doesn't rely on "left to right", but may well rely on "evaluated only once".

Hi Jonathan

I echo your points. Indeed, the PEP referenced refers to a "tuple expression" in the grammatical and not the programmatic sense.

Finally, here's something that surprised me a little bit
Notice that '+=' re-uses the same object when the object is a list, but creates a new object when it is a tuple. This raises the question: Why and how does Python behave in this way?

It's because lists are mutable and tuples are immutable. There's a dunder __iadd__ method and a dunder __add__ method. __iadd__ methods, operating on the left-hand side, modify the object in place and return it. __add__ methods return the result and don't modify the object they're called on. __iadd__ is mutating add, whereas __add__ is "return a copy with the result added".
tuple1 = tuple1.__add__(tuple2)
list1.__iadd__(list2)
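A rough model of what `a += b` does (a simplified sketch; the real lookup is on the type, and the binary form also has an __radd__ fallback):

def augmented_add(a, b):
    # Simplified model of the statement 'a += b'.
    if hasattr(type(a), "__iadd__"):
        return type(a).__iadd__(a, b)    # may mutate 'a' in place; the result is rebound to 'a'
    return a + b                         # immutable types fall back to __add__, producing a new object

lst = [1, 2]
lst = augmented_add(lst, [3])    # the very same list object, now [1, 2, 3]

tup = (1, 2)
tup = augmented_add(tup, (3,))   # a brand-new tuple, (1, 2, 3)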
Does it IN PRACTICE bring sufficient benefits to users?
I found myself needing this when I was writing a Monte Carlo simulation in Python that required incrementing a tallying counter from a subroutine.

Not sure where the rest of your message was going; it mostly just seemed to repeat examples from earlier posts?
Yes, I just wanted to summarize the existing discussion. On Sun, Aug 26, 2018 at 1:52 PM Tim Peters <tim.peters@gmail.com> wrote:

James has suggested that Python be enhanced so that statements such as `a, b += incr` become valid.
James, Matthew and I wrote
Does it IN PRACTICE bring sufficient benefits to users?
Wouldn't a numpy array be very suited for this kind of task?
Perhaps, James, you might like to refactor your code so that

tally += simulation(args)

does what you want.
As Matthew points out, you could use numpy.array. Or code your own class, by providing __add__ and __iadd__ methods.
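For instance, with numpy (the array shape and the simulation body below are illustrative stand-ins, not James's actual code):

import numpy as np

rng = np.random.default_rng(0)

def simulation(rng):
    """Stand-in for one Monte Carlo round; returns the two per-round tallies."""
    return rng.integers(0, 10, size=2)

tally = np.zeros(2, dtype=np.int64)
for _ in range(1000):
    tally += simulation(rng)    # element-wise, in-place augmented assignment

print(tally)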
-- Jonathan

Sun, 26 Aug 2018 at 14:53, Stephen J. Turnbull <turnbull.stephen.fw@u.tsukuba.ac.jp>:
What do you mean by "consistent with the normal behavior of Python sequences"? Personally, I do not see any consistency. Currently, in Python 3.7:
>>> a = [1, 2, 3]
>>> b = (1, 2, 3)
>>> a + b
TypeError: can only concatenate list (not "tuple") to list

But:

>>> a += b
>>> a
[1, 2, 3, 1, 2, 3]
As for me, Python has several historical artifacts (warts) which cannot be fixed by now. I'm sure that someone, on the contrary, regards them as advantages (and I'm not interested in being convinced of the opposite). Among them are: the reversed order of arguments in `enumerate`, the absence of literals for the `set` and `frozenset` types, the whole `bytes` story in Python 3, the `+` operator overload for sequences, and others.

But for this particular case, the `+` operator overload for concatenation, I think Python can do better and has the opportunity. Especially taking into account the switch to generators for builtins in Python 3, and that in roughly 30-40% of cases (a rough estimate) the `+` operator for sequences is used for throwaway concatenation, with unnecessary use of resources in some cases. I like the way this problem was addressed and solved in the Coconut language <http://coconut-lang.org/> with the `::` operator. Coconut uses the `::` operator for iterator chaining. Since Coconut is just a transpiler, that is the best they can do. I don't think that simply wrapping sequences and iterators with chain is the right solution for Python, and it is also not enough to justify introducing a new operator into the language. I would like to see it in a generalized form, for example with the possibility to access items with slices and indices. I understand that this is not so easy, but at least I think that this is the right direction, for several reasons: it gives a possibility for future pattern-matching syntax/functionality in Python, it can be done lazily, and, in my opinion, the `+` operator overload for builtin sequence types is a historical design mistake.

with kind regards, -gdg
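For reference, the kind of lazy chaining alluded to above is what itertools.chain already provides (a small illustration of my own):

import itertools

a = [1, 2, 3]
b = (4, 5, 6)

# Eager: a + list(b) materialises a throwaway list just to iterate over it once.
eager_total = sum(a + list(b))

# Lazy: itertools.chain walks both sequences without building an intermediate one.
lazy_total = sum(itertools.chain(a, b))

assert eager_total == lazy_total == 21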

participants (7)
- James Lu
- Jonathan Fine
- Kirill Balunov
- Matthew Einhorn
- Stephen J. Turnbull
- Steven D'Aprano
- Tim Peters