Mailman 3 Clarification of unpacking semantics. - Python-Dev

Clarification of unpacking semantics.

Brandt Bucher

5 Feb 2020

10:28 p.m.

Commits 13bc139 and 8a4cd70 introduced subtle changes in the evaluation logic of unpacking operations. Previously, all elements were evaluated prior to being collected in a container. Now, these operations are interleaved. For example, the code `[*a, *b]` used to evaluate in the order `a` -> `b` -> `a.__iter__()` -> `b.__iter__()`. Now, it evaluates as `a` -> `a.__iter__()` -> `b` -> `b.__iter__()`.

I believe this breaking semantic change is a bug, and I've opened a PR to fix it (https://github.com/python/cpython/pull/18264). My reasoning is that "unary *" isn't an operator; it doesn't appear on the operator precedence table in the docs, and you can't evaluate `*x`. Like the brackets and the comma, it's part of the syntax of the outer display expression, not the inner one. It specifies how the list should be built, so it should be evaluated last, as part of the list construction. And it has always been this way since PEP 448 (as far as I can tell).

The docs themselves seem to support this line of reasoning (https://docs.python.org/3/reference/expressions.html#evaluation-order):

...

In the following lines, expressions will be evaluated in the arithmetic order of their suffixes: ... expr1(expr2, expr3, *expr4, **expr5)

Note that the stars are not part of expressions 1-5, but are a part of the top-level call expression that operates on them all.

Mark Shannon disagrees with me (I'll let him reply rather than attempt to summarize his argument for him), but we figured it might be better to get more input here on exactly whether you all think the behavior should change or not. You can see the discussion on the PR itself for some additional points and context.

Thanks!

Brandt
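The two orderings described above can be observed with a small instrumented iterable. This is only a sketch: `Tracked`, `make` and `events` are hypothetical names introduced for illustration and do not come from the PR. Which of the two orders is printed depends on the interpreter being run, so no particular order is claimed here.

```python
events = []


class Tracked:
    """Iterable that records when it is iterated (hypothetical helper)."""

    def __init__(self, name, items):
        self.name = name
        self.items = items

    def __iter__(self):
        events.append(f"{self.name}.__iter__()")
        return iter(self.items)


def make(name, items):
    """Record that the operand expression itself was evaluated."""
    events.append(name)
    return Tracked(name, items)


# Same shape as `[*a, *b]`, with each operand wrapped so it leaves a trace.
result = [*make("a", [1, 2]), *make("b", [3])]

print(result)               # [1, 2, 3]
print(" -> ".join(events))  # the order seen on this interpreter
```

The resulting list is the same either way; only the recorded order of events differs between the two behaviours.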


Terry Reedy

6 Feb

10:30 a.m.

On 2/6/2020 1:28 AM, Brandt Bucher wrote:

...

Commits 13bc139 and 8a4cd70 introduced subtle changes in the evaluation logic of unpacking operations. Previously, all elements were evaluated prior to being collected in a container. Now, these operations are interleaved. For example, the code `[*a, *b]` used to evaluate in the order `a` -> `b` -> `a.__iter__()` -> `b.__iter__()`. Now, it evaluates as `a` -> `a.__iter__()` -> `b` -> `b.__iter__()`.

A simpler example, which sharpens the contrast, is [*a, b]. The unpacking of *b is last either way. The change is from

eval(a), eval(b), extend(a.__iter__()), append b

to

eval(a), extend(a.__iter__()), eval(b), append b

...

I believe this breaking semantic change is a bug, and I've opened a PR to fix it (https://github.com/python/cpython/pull/18264).

I carefully read and considered the original issue and the discussion on the PR and agree with the intent of the PR.

The semantic change can have visible effects due to interaction of side-effects. Examples on the PR are:

1. a.__iter__ raising while b prints: [*None, print('executed')]
2. a and b both involving the same iterator: (*it, next(it))

These previously unannounced semantic changes are apparently gratuitous side-effects of an internal refactoring. They should only be made, if at all, after discussion and agreement, then announcement and a deprecation period. But I see no reason to change the status quo semantics.
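The shared-iterator example can be turned into a short script. This is a sketch; `shared_iterator_case` is a hypothetical helper name. If both operands are evaluated before unpacking, `next(it)` consumes the first item and the tuple is built from the rest; if unpacking is interleaved, the iterator is exhausted before `next(it)` runs and `StopIteration` is raised instead.

```python
def shared_iterator_case():
    """Evaluate `(*it, next(it))` and report what happened."""
    it = iter([1, 2, 3])
    try:
        return (*it, next(it))
    except StopIteration:
        # Reached only when `*it` drained the iterator before next(it) ran.
        return "StopIteration"


outcome = shared_iterator_case()
print(outcome)  # either a tuple or the string "StopIteration"
```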

...

My reasoning is that "unary *" isn't an operator; it doesn't appear on the operator precedence table in the docs, and you can't evaluate `*x`. Like the brackets and the comma, it's part of the syntax of the outer display expression, not the inner one. It specifies how the list should be built, so it should be evaluated last, as part of the list construction. And it has always been this way since PEP 448 (as far as I can tell).

I agree that '*a' is not an expression in the meaning relevant here. https://docs.python.org/3/glossary.html says "A piece of syntax which can be evaluated to some value." This is the common math/logic/CS meaning. '*a' cannot be evaluated to a Python object. It is not an 'expression statement' and cannot be passed to eval().

In the Python grammar, an 'expression' is a 'starred_item' but a 'starred_item' need not be an expression.

starred_item ::= expression | "*" or_expr
expression ::= conditional_expression | lambda_expr
conditional_expression ::= or_test ["if" or_test "else" expression]

'*a' is a 'starred_item' but not an 'expression'.
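The point that '*a' cannot be passed to eval() can be checked with the built-in `compile` and the `ast` module. This is a small sketch; `is_standalone_expression` is a hypothetical helper name.

```python
import ast


def is_standalone_expression(source):
    """Return True if `source` compiles in 'eval' mode, i.e. as an expression."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError:
        return False
    return True


print(is_standalone_expression("a"))   # True
print(is_standalone_expression("*a"))  # False

# Inside a list display the star is accepted and shows up as a Starred node
# that is an element of the enclosing List node.
tree = ast.parse("[*a, b]", mode="eval")
element_types = [type(element).__name__ for element in tree.body.elts]
print(element_types)                   # ['Starred', 'Name']
```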

...

The docs themselves seem to support this line of reasoning (https://docs.python.org/3/reference/expressions.html#evaluation-order):

...
In the following lines, expressions will be evaluated in the arithmetic order of their suffixes: ... expr1(expr2, expr3, *expr4, **expr5)

Note that the stars are not part of expressions 1-5, but are a part of the top-level call expression that operates on them all.

Mark Shannon disagrees with me (I'll let him reply rather than attempt to summarize his argument for him), but we figured it might be better to get more input here on exactly whether you all think the behavior should change or not. You can see the discussion on the PR itself for some additional points and context.

-- Terry Jan Reedy

Mark Shannon

11:26 a.m.

On 06/02/2020 6:30 pm, Terry Reedy wrote:

...

On 2/6/2020 1:28 AM, Brandt Bucher wrote:

...
Commits 13bc139 and 8a4cd70 introduced subtle changes in the evaluation logic of unpacking operations. Previously, all elements were evaluated prior to being collected in a container. Now, these operations are interleaved. For example, the code `[*a, *b]` used to evaluate in the order `a` -> `b` -> `a.__iter__()` -> `b.__iter__()`. Now, it evaluates as `a` -> `a.__iter__()` -> `b` -> `b.__iter__()`.

A simpler example, which sharpens the contrast, is [*a, b]. The unpacking of *b is last either way. The change is from

eval(a), eval(b), extend(a.__iter__()), append b

to

eval(a), extend(a.__iter__()), eval(b), append b

...
I believe this breaking semantic change is a bug, and I've opened a PR to fix it (https://github.com/python/cpython/pull/18264).

I carefully read and considered the original issue and the discussion on the PR and agree with the intent of the PR.

The semantic change can have visible effects due to interaction of side-effects. Examples on the PR are:

1. a.__iter__ raising while b prints: [*None, print('executed')]
2. a and b both involving the same iterator: (*it, next(it))

These previously unannounced semantic changes are apparently gratuitous side-effects of an internal refactoring. They should only be made, if at all, after discussion and agreement, then announcement and a deprecation period. But I see no reason to change the status quo semantics.

These changes were unannounced because I didn't realize the current implementation was broken. There were no tests for raising an exception in the middle of unpacking a list, and it didn't occur to me to add them.

...

...
My reasoning is that "unary *" isn't an operator; it doesn't appear on the operator precedence table in the docs, and you can't evaluate `*x`. Like the brackets and the comma, it's part of the syntax of the outer display expression, not the inner one. It specifies how the list should be built, so it should be evaluated last, as part of the list construction. And it has always been this way since PEP 448 (as far as I can tell).

I agree that '*a' is not an expression in the meaning relevant here. https://docs.python.org/3/glossary.html says "A piece of syntax which can be evaluated to some value." This is the common math/logic/CS meaning. '*a' cannot be evaluated to a Python object. It is not an 'expression statement' and cannot be passed to eval().

In the Python grammar, an 'expression' is a 'starred_item' but a 'starred_item' need not be an expression.

starred_item ::= expression | "*" or_expr
expression ::= conditional_expression | lambda_expr
conditional_expression ::= or_test ["if" or_test "else" expression]

'*a' is a 'starred_item' but not an 'expression'.

I don't know where you got that grammar from, but not GitHub https://github.com/python/cpython/blob/master/Grammar/Grammar#L142

...

...
The docs themselves seem to support this line of reasoning (https://docs.python.org/3/reference/expressions.html#evaluation-order):

...
In the following lines, expressions will be evaluated in the arithmetic order of their suffixes: ... expr1(expr2, expr3, *expr4, **expr5)

Note that the stars are not part of expressions 1-5, but are a part of the top-level call expression that operates on them all.

Mark Shannon disagrees with me (I'll let him reply rather than attempt to summarize his argument for him), but we figured it might be better to get more input here on exactly whether you all think the behavior should change or not. You can see the discussion on the PR itself for some additional points and context.

Terry Reedy

12:03 p.m.

On 2/6/2020 2:26 PM, Mark Shannon wrote:

...

...
In the python grammar, an 'expression' is a 'starred_item' but a 'starred_item' need not be an expression.

starred_item ::= expression | "*" or_expr
expression ::= conditional_expression | lambda_expr
conditional_expression ::= or_test ["if" or_test "else" expression]

'*a' is a 'starred_item' but not an 'expression'.

I don't know where you got that grammar from, but not GitHub https://github.com/python/cpython/blob/master/Grammar/Grammar#L142

From the human readable docs:
https://docs.python.org/3/reference/expressions.html#expression-lists
https://docs.python.org/3/reference/expressions.html#conditional-expressions

Mark Shannon

12:11 p.m.

Hi everyone,

I recently unintentionally changed the semantics of this expression `[print("a"), *None, print("b")]`. PEP 448 states that this should raise an exception, but does not specify evaluation order.

My implementation was based on the general principle that evaluation in Python is left to right unless specified otherwise.

The question is, what should

>>> [print("a"), *None, print("b")]

print before raising an exception? I think just "a", Brandt thinks "a" and "b".

Brandt argues that I have introduced a bug. I think I have fixed one, admittedly one that I didn't previously realize existed.

There is a precedent for fixing evaluation order to be left to right: https://bugs.python.org/issue29652

On 06/02/2020 6:28 am, Brandt Bucher wrote:

...

Commits 13bc139 and 8a4cd70 introduced subtle changes in the evaluation logic of unpacking operations. Previously, all elements were evaluated prior to being collected in a container. Now, these operations are interleaved. For example, the code `[*a, *b]` used to evaluate in the order `a` -> `b` -> `a.__iter__()` -> `b.__iter__()`. Now, it evaluates as `a` -> `a.__iter__()` -> `b` -> `b.__iter__()`.

I believe this breaking semantic change is a bug, and I've opened a PR to fix it (https://github.com/python/cpython/pull/18264). My reasoning is that "unary *" isn't an operator; it doesn't appear on the operator precedence table in the docs, and you can't evaluate `*x`. Like the brackets and the comma, it's part of the syntax of the outer display expression, not the inner one. It specifies how the list should be built, so it should be evaluated last, as part of the list construction. And it has always been this way since PEP 448 (as far as I can tell).

The lack of explicitly listed precedence for an operator does not mean that it isn't an operator, merely that it doesn't need precedence due to the grammar. For example the slice creation operator `x:y` in `a[x:y]` needs no precedence as it is constrained to only occur in indexing operations. Likewise the unpacking operation `*a` can only occur in certain expressions. That doesn't mean that it is not an operation.

...

The docs themselves seem to support this line of reasoning (https://docs.python.org/3/reference/expressions.html#evaluation-order):

...
In the following lines, expressions will be evaluated in the arithmetic order of their suffixes: ... expr1(expr2, expr3, *expr4, **expr5)

Note that the stars are not part of expressions 1-5, but are a part of the top-level call expression that operates on them all.

There are many layers of grammar that make up a call. It is entirely arbitrary what you call an expression or some other grammatical entity. `*expr4` is parsed as an argument, the same as `expr2`. Cheers, Mark.

...

Mark Shannon disagrees with me (I'll let him reply rather than attempt to summarize his argument for him), but we figured it might be better to get more input here on exactly whether you all think the behavior should change or not. You can see the discussion on the PR itself for some additional points and context.

Thanks!

Brandt _______________________________________________ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-leave@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/4HS2ZEQW... Code of Conduct: http://python.org/psf/codeofconduct/
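The question about `[print("a"), *None, print("b")]` can be answered for any given interpreter by capturing what is printed before the exception. This is a sketch under that assumption; `run_display` is a hypothetical helper name, and the result is deliberately not stated here because it is exactly the behaviour under discussion.

```python
import contextlib
import io


def run_display():
    """Evaluate the display, returning the printed words and the exception."""
    buffer = io.StringIO()
    error = None
    with contextlib.redirect_stdout(buffer):
        try:
            [print("a"), *None, print("b")]
        except TypeError as exc:
            # Unpacking None fails in either ordering; only the timing differs.
            error = exc
    return buffer.getvalue().split(), error


printed, error = run_display()
print("printed before the exception:", printed)
print("exception:", type(error).__name__)
```

If only "a" is reported, the interpreter unpacks `*None` before evaluating the later element; if both "a" and "b" are reported, all elements are evaluated before any unpacking.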

Paul Moore

12:55 p.m.

On Thu, 6 Feb 2020 at 20:17, Mark Shannon wrote:

...

I recently unintentionally changed the semantics of this expression `[print("a"), *None, print("b")]`. PEP 448 states that this should raise an exception, but does not specify evaluation order.

My implementation was based on the general principle that evaluation in Python is left to right unless specified otherwise.

The question is, what should

>>> [print("a"), *None, print("b")]

print before raising an exception? I think just "a", Brandt thinks "a" and "b".

Brandt argues that I have introduced a bug. I think I have fixed one, admittedly one that I didn't previously realize existed.

I think that if this were a new feature, either order would be arguable. But given that this code previously printed "a b", I think that changing the order is a change in user-visible behaviour and therefore the existing behaviour should be preserved (certainly in bugfix releases).

...

There is a precedent for fixing evaluation order to be left to right: https://bugs.python.org/issue29652

Changing the order as part of a feature release, listing it as a user-visible behaviour change in "What's New" seems acceptable to me. But characterising this change as a bug fix doesn't.

...

On 06/02/2020 6:28 am, Brandt Bucher wrote:

...
Commits 13bc139 and 8a4cd70 introduced subtle changes in the evaluation logic of unpacking operations. Previously, all elements were evaluated prior to being collected in a container. Now, these operations are interleaved. For example, the code `[*a, *b]` used to evaluate in the order `a` -> `b` -> `a.__iter__()` -> `b.__iter__()`. Now, it evaluates as `a` -> `a.__iter__()` -> `b` -> `b.__iter__()`.

I believe this breaking semantic change is a bug, and I've opened a PR to fix it (https://github.com/python/cpython/pull/18264). My reasoning is that "unary *" isn't an operator; it doesn't appear on the operator precedence table in the docs, and you can't evaluate `*x`. Like the brackets and the comma, it's part of the syntax of the outer display expression, not the inner one. It specifies how the list should be built, so it should be evaluated last, as part of the list construction. And it has always been this way since PEP 448 (as far as I can tell).

I agree that it's the *change in behaviour* that's the issue here. We should fix that (by reverting to 3.8.1 behaviour) before 3.8.2 gets released.

...

The lack of explicitly listed precedence for an operator does not mean that it isn't an operator, merely that it doesn't need precedence due to the grammar. For example the slice creation operator `x:y` in `a[x:y]` needs no precedence as it is constrained to only occur in indexing operations. Likewise the unpacking operation `*a` can only occur in certain expressions. That doesn't mean that it is not an operation.

I don't think this is a particularly helpful way of looking at it. I don't think there's any particular need here to try to argue that either behaviour is "right" or "wrong". The previous behaviour has been round for some time, and the change in behaviour was (by your own admission) inadvertent. Therefore it seems obvious to me that the reasonable thing to do is to apply Brandt's PR, that restores the old evaluation order (with the *intended* fix from your patch intact, as I understand it).

If, once this has been done, you still care strongly enough to argue for a behaviour change, targeted at 3.9 (assuming no-one insists on a deprecation period for the change!), then that's fine. Personally I think the arguments either way are weak, and I'd be inclined not to care, or to mildly prefer not bothering, in such a debate - but let's have the debate once the pressure of "is it OK to do this in a bugfix release?" has been removed.

Paul

Brandt Bucher

1:11 p.m.

...

We should fix that (by reverting to 3.8.1 behaviour) before 3.8.2 gets released.

The commits which changed the behavior were bytecode/compiler changes that only went to master. I don't think they are present on any other branches.

Guido van Rossum

1:11 p.m.

I like Mark’s new semantics better, but agree with the point about this being a “feature”.

-- --Guido (mobile)

Guido van Rossum

1:46 p.m.

Then there’s nothing to do here right? Or just add it to whatsnew?

On Thu, Feb 6, 2020 at 13:20 Brandt Bucher wrote:

...

...
We should fix that (by reverting to 3.8.1 behaviour) before 3.8.2 gets released.

The commits which changed the behavior were bytecode/compiler changes that only went to master. I don't think they are present on any other branches.

-- --Guido (mobile)

Serhiy Storchaka

1:56 p.m.

06.02.20 08:28, Brandt Bucher wrote:

...

Commits 13bc139 and 8a4cd70 introduced subtle changes in the evaluation logic of unpacking operations. Previously, all elements were evaluated prior to being collected in a container. Now, these operations are interleaved. For example, the code `[*a, *b]` used to evaluate in the order `a` -> `b` -> `a.__iter__()` -> `b.__iter__()`. Now, it evaluates as `a` -> `a.__iter__()` -> `b` -> `b.__iter__()`.

I believe this breaking semantic change is a bug, and I've opened a PR to fix it (https://github.com/python/cpython/pull/18264). My reasoning is that "unary *" isn't an operator; it doesn't appear on the operator precedence table in the docs, and you can't evaluate `*x`. Like the brackets and the comma, it's part of the syntax of the outer display expression, not the inner one. It specifies how the list should be built, so it should be evaluated last, as part of the list construction. And it has always been this way since PEP 448 (as far as I can tell).

The docs themselves seem to support this line of reasoning (https://docs.python.org/3/reference/expressions.html#evaluation-order):

...
In the following lines, expressions will be evaluated in the arithmetic order of their suffixes: ... expr1(expr2, expr3, *expr4, **expr5)

Note that the stars are not part of expressions 1-5, but are a part of the top-level call expression that operates on them all.

Mark Shannon disagrees with me (I'll let him reply rather than attempt to summarize his argument for him), but we figured it might be better to get more input here on exactly whether you all think the behavior should change or not. You can see the discussion on the PR itself for some additional points and context.

I have two problems with this change.

1. It changes error messages.

>>> print(*1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: print() argument after * must be an iterable, not int
>>> print(*1, *2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: Value after * must be an iterable, not int

In 3.8 you got the same error message.

>>> print(*1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: print() argument after * must be an iterable, not int
>>> print(*1, *2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: print() argument after * must be an iterable, not int

I am not sure whether the function name is useful information, but some effort was spent to preserve it. In any case, error messages should be consistent.

2. It introduces a performance regression.

In 3.8 the bytecode for `(*a, *b, *c)` was:

  1           0 LOAD_NAME                0 (a)
              2 LOAD_NAME                1 (b)
              4 LOAD_NAME                2 (c)
              6 BUILD_TUPLE_UNPACK       3

In master it is:

  1           0 BUILD_LIST               0
              2 LOAD_NAME                0 (a)
              4 LIST_EXTEND              1
              6 LOAD_NAME                1 (b)
              8 LIST_EXTEND              1
             10 LOAD_NAME                2 (c)
             12 LIST_EXTEND              1
             14 LIST_TO_TUPLE

The bytecode is larger, therefore slower. It also prevents possible optimization of BUILD_TUPLE_UNPACK and similar opcodes for the common case of tuples and lists, which would allow minimizing the number of memory allocations.
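Whether the longer bytecode is measurably slower can be checked with a rough micro-benchmark. The following is only a sketch: the operand values, the `build` helper and the iteration count are assumptions chosen for illustration, and the numbers it produces mean little unless the same script is run on both interpreter builds being compared.

```python
import timeit

# Small operands of the "common case" kinds mentioned above (tuples and lists).
a, b, c = (1, 2), [3, 4], (5,)


def build():
    """Build a tuple with the same shape as `(*a, *b, *c)`."""
    return (*a, *b, *c)


runs = 100_000
seconds = timeit.timeit(build, number=runs)
print(f"{seconds / runs * 1e9:.0f} ns per display")
```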

Guido van Rossum

3 p.m.

How did we move from [*a,...] to print(*a,...)? They are quite different.

-- --Guido (mobile)

Paul Moore

7 Feb

12:10 a.m.

On Thu, 6 Feb 2020 at 23:14, Guido van Rossum wrote:

...

How did we move from [*a,...] to print(*a,...)? They are quite different.

It was a good way to demonstrate evaluation order, as an expression with a visible side effect. What the expression [print("a"), *None, print("b")] prints before the "cannot unpack NoneType" exception, demonstrates what order the expressions were evaluated in.

Paul

Paul Moore

12:11 a.m.

Sorry, ignore that - I see Serhiy used "print(*a)".

Paul

On Fri, 7 Feb 2020 at 08:10, Paul Moore wrote:

...

On Thu, 6 Feb 2020 at 23:14, Guido van Rossum wrote:

...
How did we move from [*a,...] to print(*a,...)? They are quite different.

It was a good way to demonstrate evaluation order, as an expression with a visible side effect. What the expression [print("a"), *None, print("b")] prints before the "cannot unpack NoneType" exception, demonstrates what order the expressions were evaluated in.

Paul

Serhiy Storchaka

12:18 a.m.

07.02.20 01:00, Guido van Rossum wrote:

...

How did we move from [*a,...] to print(*a,...)? They are quite different.

They are quite similar. The code for `(*a, *b, *c)` is:

  1           0 LOAD_NAME                0 (a)
              2 LOAD_NAME                1 (b)
              4 LOAD_NAME                2 (c)
              6 BUILD_TUPLE_UNPACK       3

The code for `print(*a, *b, *c)` is:

  1           0 LOAD_NAME                0 (print)
              2 LOAD_NAME                1 (a)
              4 LOAD_NAME                2 (b)
              6 LOAD_NAME                3 (c)
              8 BUILD_TUPLE_UNPACK_WITH_CALL     3
             10 CALL_FUNCTION_EX         0

It is covered by PEP 448 [1].

* BUILD_TUPLE_UNPACK, BUILD_LIST_UNPACK, BUILD_SET_UNPACK and BUILD_MAP_UNPACK were used to unpack iterables or mappings in tuple, list, set and dict displays.
* BUILD_TUPLE_UNPACK_WITH_CALL and BUILD_MAP_UNPACK_WITH_CALL were used when passing multiple var-positional and var-keyword arguments to a function.

All of them except BUILD_TUPLE_UNPACK_WITH_CALL were added in issue2292 [2]. BUILD_TUPLE_UNPACK_WITH_CALL was added in issue28257 [3] to unify error messages.

[1] https://www.python.org/dev/peps/pep-0448/
[2] https://bugs.python.org/issue2292
[3] https://bugs.python.org/issue28257
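The comparison between the display and the call can be reproduced on any interpreter with the `dis` module. This is a sketch; `opnames` is a hypothetical helper name, and the opcode names printed will differ between Python versions, so none are asserted here.

```python
import dis


def opnames(source):
    """Return the opcode names the compiler emits for an expression."""
    code = compile(source, "<example>", "eval")
    return [instruction.opname for instruction in dis.get_instructions(code)]


display_ops = opnames("(*a, *b, *c)")
call_ops = opnames("print(*a, *b, *c)")

print(display_ops)
print(call_ops)
# Opcodes that appear only in the call form on this interpreter.
print(sorted(set(call_ops) - set(display_ops)))
```

Compiling in 'eval' mode never runs the code, so the names `a`, `b` and `c` do not need to exist.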

Mark Shannon

1:37 a.m.

On 06/02/2020 9:56 pm, Serhiy Storchaka wrote:

...

06.02.20 08:28, Brandt Bucher wrote:

...
Commits 13bc139 and 8a4cd70 introduced subtle changes in the evaluation logic of unpacking operations. Previously, all elements were evaluated prior to being collected in a container. Now, these operations are interleaved. For example, the code `[*a, *b]` used to evaluate in the order `a` -> `b` -> `a.__iter__()` -> `b.__iter__()`. Now, it evaluates as `a` -> `a.__iter__()` -> `b` -> `b.__iter__()`.

I believe this breaking semantic change is a bug, and I've opened a PR to fix it (https://github.com/python/cpython/pull/18264). My reasoning is that "unary *" isn't an operator; it doesn't appear on the operator precedence table in the docs, and you can't evaluate `*x`. Like the brackets and the comma, it's part of the syntax of the outer display expression, not the inner one. It specifies how the list should be built, so it should be evaluated last, as part of the list construction. And it has always been this way since PEP 448 (as far as I can tell).

The docs themselves seem to support this line of reasoning (https://docs.python.org/3/reference/expressions.html#evaluation-order):

...
In the following lines, expressions will be evaluated in the arithmetic order of their suffixes: ... expr1(expr2, expr3, *expr4, **expr5)

Note that the stars are not part of expressions 1-5, but are a part of the top-level call expression that operates on them all.

Mark Shannon disagrees with me (I'll let him reply rather than attempt to summarize his argument for him), but we figured it might be better to get more input here on exactly whether you all think the behavior should change or not. You can see the discussion on the PR itself for some additional points and context.
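The ordering difference described in the quoted message can be observed directly. The sketch below is only an illustration: `Traced`, `load` and `events` are hypothetical names, and a function call stands in for the plain name lookup of `a` and `b` so that evaluating each operand leaves a visible trace. Which of the two orders is printed depends on the interpreter version being run.

```python
events = []


class Traced:
    """Iterable that records when its __iter__ method is called."""

    def __init__(self, label, items):
        self.label = label
        self.items = items

    def __iter__(self):
        events.append(f"{self.label}.__iter__()")
        return iter(self.items)


def load(label, items):
    """Stand-in for evaluating the operand expression; records the evaluation."""
    events.append(label)
    return Traced(label, items)


result = [*load("a", [1, 2]), *load("b", [3])]

print(result)
print(" -> ".join(events))
```

An interpreter that evaluates all operands first prints `a -> b -> a.__iter__() -> b.__iter__()`; one that interleaves evaluation and unpacking prints `a -> a.__iter__() -> b -> b.__iter__()`. The resulting list is the same either way.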

I have two problems with this change.

1. It changes error messages.

>>> print(*1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: print() argument after * must be an iterable, not int
>>> print(*1, *2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: Value after * must be an iterable, not int

In 3.8 you got the same error message.

>>> print(*1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: print() argument after * must be an iterable, not int
>>> print(*1, *2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: print() argument after * must be an iterable, not int

I am not sure whether the function name is useful information, but some effort was spent to preserve it. In any case, error messages should be consistent.
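Whether the two calls produce the same message on a given interpreter can be checked without reading tracebacks by hand. This is a small sketch; `unpack_error` is a hypothetical helper, and the exact wording of the messages varies between Python versions, so none is assumed here.

```python
def unpack_error(thunk):
    """Call thunk and return the message of the TypeError it raises, if any."""
    try:
        thunk()
    except TypeError as exc:
        return str(exc)
    return None


# Both calls fail while unpacking, before print() is ever invoked.
single = unpack_error(lambda: print(*1))
double = unpack_error(lambda: print(*1, *2))

print(single)
print(double)
print("consistent:", single == double)
```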

Including the function name in the error message is misleading. "TypeError: print() argument after * must be an iterable, not int" implies that the error is related to `print`. It is not; the error is entirely on the caller's side. The object being called is irrelevant. In 3.8:

>>> 1(*1)
  File "<stdin>", line 1, in <module>
TypeError: int object argument after * must be an iterable, not int

"int object argument" is nonsense.

...

2. It introduces a performance regression.

In 3.8 the bytecode for `(*a, *b, *c)` was:

  1           0 LOAD_NAME                0 (a)
              2 LOAD_NAME                1 (b)
              4 LOAD_NAME                2 (c)
              6 BUILD_TUPLE_UNPACK       3

In master it is:

  1           0 BUILD_LIST               0
              2 LOAD_NAME                0 (a)
              4 LIST_EXTEND              1
              6 LOAD_NAME                1 (b)
              8 LIST_EXTEND              1
             10 LOAD_NAME                2 (c)
             12 LIST_EXTEND              1
             14 LIST_TO_TUPLE

The bytecode is larger, therefore slower. It also prevents possible optimization of BUILD_TUPLE_UNPACK and similar opcodes for the common case of tuples and lists, which would make it possible to minimize the number of memory allocations.

That's just not true. Larger bytecode is not necessarily slower; in fact, if the operations are more efficient, it can easily be faster.

Please don't waste your efforts "optimizing" rare bytecodes like "BUILD_TUPLE_UNPACK". It just makes the interpreter bigger and has no effect on speed, because they are rarely executed. In the compiled standard library, the most common unpacking bytecode, `LIST_EXTEND`, represents less than 1/1000 of the (static) total (557 out of 669k).

Cheers,
Mark.
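A static opcode count of the kind cited above can be approximated with the `dis` module. The message does not say how the 557 and 669k figures were obtained, so the sketch below is only one possible approach, applied to a tiny made-up sample (`source`) instead of the standard library; `count_opcodes` is a hypothetical helper.

```python
import dis
import types
from collections import Counter


def count_opcodes(code, counts=None):
    """Tally opcode names in a code object and in every nested code object."""
    if counts is None:
        counts = Counter()
    for ins in dis.get_instructions(code):
        counts[ins.opname] += 1
    # Function bodies, comprehensions, etc. live in co_consts as code objects.
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            count_opcodes(const, counts)
    return counts


# Made-up sample source; a real survey would walk the stdlib's .py files.
source = """
def merge(a, b):
    return [*a, *b]

def total(xs):
    return sum(xs)
"""

counts = count_opcodes(compile(source, "<sample>", "exec"))
total = sum(counts.values())

for name, n in counts.most_common(5):
    print(f"{name:24} {n:4} {n / total:.1%}")
```

These are static counts: they say how often an opcode appears in compiled code, not how often it is executed at run time.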


Guido van Rossum

10:11 a.m.

On Fri, Feb 7, 2020 at 1:48 AM Mark Shannon wrote:

...

Including the function name in the error message is misleading.

"TypeError: print() argument after * must be an iterable, not int" implies that the error is related to `print`. It is not; the error is entirely on the caller's side. The object being called is irrelevant.

That's true, but the function name may help the user find the right call in the code if there are multiple calls on the line.

--
--Guido van Rossum (python.org/~guido)

Brett Cannon

3:25 p.m.

I agree that if this is only in 3.9, then this is a cleanup of semantics that were a bit off, and it should stay but get a mention in What's New.

16 comments

7 participants

participants (7)

Brandt Bucher
Brett Cannon
Guido van Rossum
Mark Shannon
Paul Moore
Serhiy Storchaka
Terry Reedy
