Clarification of unpacking semantics.
Commits 13bc139 and 8a4cd70 introduced subtle changes in the evaluation logic of unpacking operations. Previously, all elements were evaluated prior to being collected in a container. Now, these operations are interleaved. For example, the code `[*a, *b]` used to evaluate in the order `a` -> `b` -> `a.__iter__()` -> `b.__iter__()`. Now, it evaluates as `a` -> `a.__iter__()` -> `b` -> `b.__iter__()`.
I believe this breaking semantic change is a bug, and I've opened a PR to fix it (https://github.com/python/cpython/pull/18264).
My reasoning is that "unary *" isn't an operator; it doesn't appear on the operator precedence table in the docs, and you can't evaluate `*x`. Like the brackets and the comma, it's part of the syntax of the outer display expression, not the inner one. It specifies how the list should be built, so it should be evaluated last, as part of the list construction. And it has always been this way since PEP 448 (as far as I can tell).
The docs themselves seem to support this line of reasoning (https://docs.python.org/3/reference/expressions.html#evaluation-order):
In the following lines, expressions will be evaluated in the arithmetic order of their suffixes: ... expr1(expr2, expr3, *expr4, **expr5)
Note that the stars are not part of expressions 1-5, but are a part of the top-level call expression that operates on them all.
Mark Shannon disagrees with me (I'll let him reply rather than attempt to summarize his argument for him), but we figured it might be better to get more input here on exactly whether you all think the behavior should change or not. You can see the discussion on the PR itself for some additional points and context.
Thanks!
Brandt
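The two orderings described for `[*a, *b]` can be observed with a small tracing sketch. The names `Traced`, `evaluate` and `log` are hypothetical helpers introduced only for this illustration; they are not part of the PR or of CPython. Which order is printed depends on the interpreter version the code runs on.

```python
log = []  # records the order in which things happen


class Traced:
    """Iterable that logs when its __iter__ method is called."""

    def __init__(self, name, items):
        self.name = name
        self.items = items

    def __iter__(self):
        log.append(f"{self.name}.__iter__()")
        return iter(self.items)


def evaluate(name, items):
    """Stand-in for evaluating the expression `name`: log it, then return the iterable."""
    log.append(name)
    return Traced(name, items)


result = [*evaluate("a", [1, 2]), *evaluate("b", [3])]

print(result)            # the same list under either ordering
print(" -> ".join(log))  # the order of events depends on the interpreter version
```

The resulting list is identical either way; only the logged sequence of events differs, which is why the change is visible only when evaluation or iteration has side effects.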
On 2/6/2020 1:28 AM, Brandt Bucher wrote:
Commits 13bc139 and 8a4cd70 introduced subtle changes in the evaluation logic of unpacking operations. Previously, all elements were evaluated prior to being collected in a container. Now, these operations are interleaved. For example, the code `[*a, *b]` used to evaluate in the order `a` -> `b` -> `a.__iter__()` -> `b.__iter__()`. Now, it evaluates as `a` -> `a.__iter__()` -> `b` -> `b.__iter__()`.
A simpler example, which sharpens the contrast, is [*a, b]. The unpacking of *b is last either way. The change is from
eval(a), eval(b), extend(a.__iter__()), append b
to
eval(a), extend(a.__iter__()), eval(b), append b
I believe this breaking semantic change is a bug, and I've opened a PR to fix it (https://github.com/python/cpython/pull/18264).
I carefully read and considered the original issue and the discussion on the PR and agree with the intent of the PR.
The semantic change can have visible effects due to interaction of side-effects. Examples on the PR are
1. a.__iter__ raising while b prints ([*None, print('executed')]), and
2. a and b both involving the same iterator (*it, next(it))
These previously unannounced semantic changes are apparently gratuitous side-effects of an internal refactoring. They should only be made, if at all, after discussion and agreement, then announcement and a deprecation period. But I see no reason to change the status quo semantics.
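Both examples from the PR can be run as a rough sketch. The function names `none_then_print` and `shared_iterator` are hypothetical, chosen only for this illustration, and the outcome of each depends on which evaluation order the running interpreter implements.

```python
import contextlib
import io


def none_then_print():
    """Run [*None, print('executed')] and report what was printed before the error."""
    captured = io.StringIO()
    error = None
    with contextlib.redirect_stdout(captured):
        try:
            [*None, print("executed")]
        except TypeError as exc:
            error = exc
    return captured.getvalue(), error


def shared_iterator():
    """Run (*it, next(it)) where both elements draw from the same iterator."""
    it = iter([1, 2, 3])
    try:
        return (*it, next(it))
    except StopIteration:
        # Reached when *it is exhausted before next(it) is evaluated.
        return "StopIteration"


printed, error = none_then_print()
print(repr(printed), type(error).__name__)
print(shared_iterator())
```

If all elements are evaluated before unpacking, the first case prints "executed" before the TypeError and the second yields a tuple ending in the first item. If unpacking is interleaved, nothing is printed in the first case and the second raises StopIteration.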
My reasoning is that "unary *" isn't an operator; it doesn't appear on the operator precedence table in the docs, and you can't evaluate `*x`. Like the brackets and the comma, it's part of the syntax of the outer display expression, not the inner one. It specifies how the list should be built, so it should be evaluated last, as part of the list construction. And it has always been this way since PEP 448 (as far as I can tell).
I agree that '*a' is not an expression in the meaning relevant here. https://docs.python.org/3/glossary.html says "A piece of syntax which can be evaluated to some value." This is the common math/logic/CS meaning. '*a' cannot be evaluated to a Python object. It is not an 'expression statement' and cannot be passed to eval().
In the Python grammar, an 'expression' is a 'starred_item' but a 'starred_item' need not be an expression.
starred_item ::= expression | "*" or_expr
expression ::= conditional_expression | lambda_expr
conditional_expression ::= or_test ["if" or_test "else" expression]
'*a' is a 'starred_item' but not an 'expression'.
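The claim that '*a' cannot be passed to eval() can be checked without defining any names, by compiling in "eval" mode. This is a minimal sketch; `is_valid_expression` is a hypothetical helper, not a standard library function.

```python
import ast


def is_valid_expression(source):
    """Return True if `source` compiles as a standalone expression."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError:
        return False
    return True


print(is_valid_expression("a"))        # True
print(is_valid_expression("*a"))       # False
print(is_valid_expression("[*a, b]"))  # True

# Inside a display, the starred item shows up as an element of the list node.
tree = ast.parse("[*a, b]", mode="eval")
element_types = [type(element).__name__ for element in tree.body.elts]
print(element_types)  # ['Starred', 'Name']
```

The sketch only shows what the compiler accepts and how the `ast` module represents a starred element; it does not settle whether unpacking should count as an operation.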
The docs themselves seem to support this line of reasoning (https://docs.python.org/3/reference/expressions.html#evaluation-order):
In the following lines, expressions will be evaluated in the arithmetic order of their suffixes: ... expr1(expr2, expr3, *expr4, **expr5)
Note that the stars are not part of expressions 1-5, but are a part of the top-level call expression that operates on them all.
Mark Shannon disagrees with me (I'll let him reply rather than attempt to summarize his argument for him), but we figured it might be better to get more input here on exactly whether you all think the behavior should change or not. You can see the discussion on the PR itself for some additional points and context.
-- Terry Jan Reedy
On 06/02/2020 6:30 pm, Terry Reedy wrote:
On 2/6/2020 1:28 AM, Brandt Bucher wrote:
Commits 13bc139 and 8a4cd70 introduced subtle changes in the evaluation logic of unpacking operations. Previously, all elements were evaluated prior to being collected in a container. Now, these operations are interleaved. For example, the code `[*a, *b]` used to evaluate in the order `a` -> `b` -> `a.__iter__()` -> `b.__iter__()`. Now, it evaluates as `a` -> `a.__iter__()` -> `b` -> `b.__iter__()`.
A simpler example, which sharpens the contrast, is [*a, b]. The unpacking of *b is last either way. The change is from
eval(a), eval(b), extend(a.__iter__()), append b
to
eval(a), extend(a.__iter__()), eval(b), append b
I believe this breaking semantic change is a bug, and I've opened a PR to fix it (https://github.com/python/cpython/pull/18264).
I carefully read and considered the original issue and the discussion on the PR and agree with the intent of the PR.
The semantic change can have visible effects due to interaction of side-effects. Examples on the PR are
1. a.__iter__ raising while b prints ([*None, print('executed')]), and
2. a and b both involving the same iterator (*it, next(it))
These previously unannounced semantic changes are apparently gratuitous side-effects of an internal refactoring. They should only be made, if at all, after discussion and agreement, then announcement and a deprecation period. But I see no reason to change the status quo semantics.
These changes were unannounced because I didn't realize the current implementation was broken. There were no tests for raising an exception in the middle of unpacking a list, and it didn't occur to me to add them.
My reasoning is that "unary *" isn't an operator; it doesn't appear on the operator precedence table in the docs, and you can't evaluate `*x`. Like the brackets and the comma, it's part of the syntax of the outer display expression, not the inner one. It specifies how the list should be built, so it should be evaluated last, as part of the list construction. And it has always been this way since PEP 448 (as far as I can tell).
I agree that '*a' is not an expression in the meaning relevant here. https://docs.python.org/3/glossary.html says "A piece of syntax which can be evaluated to some value." This is the common math/logic/CS meaning. '*a' cannot be evaluated to a Python object. It is not an 'expression statement' and cannot be passed to eval().
In the Python grammar, an 'expression' is a 'starred_item' but a 'starred_item' need not be an expression.
starred_item ::= expression | "*" or_expr
expression ::= conditional_expression | lambda_expr
conditional_expression ::= or_test ["if" or_test "else" expression]
'*a' is a 'starred_item' but not an 'expression'.
I don't know where you got that grammar from, but not GitHub https://github.com/python/cpython/blob/master/Grammar/Grammar#L142
The docs themselves seem to support this line of reasoning (https://docs.python.org/3/reference/expressions.html#evaluation-order):
In the following lines, expressions will be evaluated in the arithmetic order of their suffixes: ... expr1(expr2, expr3, *expr4, **expr5)
Note that the stars are not part of expressions 1-5, but are a part of the top-level call expression that operates on them all.
Mark Shannon disagrees with me (I'll let him reply rather than attempt to summarize his argument for him), but we figured it might be better to get more input here on exactly whether you all think the behavior should change or not. You can see the discussion on the PR itself for some additional points and context.
On 2/6/2020 2:26 PM, Mark Shannon wrote:
In the python grammar, an 'expression' is a 'starred_item' but a 'starred_item' need not be an expression.
starred_item ::= expression | "*" or_expr
expression ::= conditional_expression | lambda_expr
conditional_expression ::= or_test ["if" or_test "else" expression]
'*a' is a 'starred_item' but not an 'expression'.
I don't know where you got that grammar from, but not GitHub https://github.com/python/cpython/blob/master/Grammar/Grammar#L142
From the human readable docs
https://docs.python.org/3/reference/expressions.html#expression-lists
https://docs.python.org/3/reference/expressions.html#conditional-expressions
Hi everyone,
I recently unintentionally changed the semantics of this expression `[print("a"), *None, print("b")]`. PEP 448 states that this should raise an exception, but does not specify evaluation order.
My implementation was based on the general principle that evaluation in Python is left to right unless specified otherwise.
The question is, what should
[print("a"), *None, print("b")]
print before raising an exception? I think just "a", Brandt thinks "a" and "b".
Brandt argues that I have introduced a bug. I think I have fixed one, admittedly one that I didn't previously realize existed.
There is a precedent for fixing evaluation order to be left to right: https://bugs.python.org/issue29652
On 06/02/2020 6:28 am, Brandt Bucher wrote:
Commits 13bc139 and 8a4cd70 introduced subtle changes in the evaluation logic of unpacking operations. Previously, all elements were evaluated prior to being collected in a container. Now, these operations are interleaved. For example, the code `[*a, *b]` used to evaluate in the order `a` -> `b` -> `a.__iter__()` -> `b.__iter__()`. Now, it evaluates as `a` -> `a.__iter__()` -> `b` -> `b.__iter__()`.
I believe this breaking semantic change is a bug, and I've opened a PR to fix it (https://github.com/python/cpython/pull/18264). My reasoning is that "unary *" isn't an operator; it doesn't appear on the operator precedence table in the docs, and you can't evaluate `*x`. Like the brackets and the comma, it's part of the syntax of the outer display expression, not the inner one. It specifies how the list should be built, so it should be evaluated last, as part of the list construction. And it has always been this way since PEP 448 (as far as I can tell).
The lack of explicitly listed precedence for an operator does not mean that it isn't an operator, merely that it doesn't need precedence due to the grammar. For example the slice creation operator `x:y` in `a[x:y]` needs no precedence as it is constrained to only occur in indexing operations. Likewise the unpacking operation `*a` can only occur in certain expressions. That doesn't mean that is not an operation.
The docs themselves seem to support this line of reasoning (https://docs.python.org/3/reference/expressions.html#evaluation-order):
In the following lines, expressions will be evaluated in the arithmetic order of their suffixes: ... expr1(expr2, expr3, *expr4, **expr5)
Note that the stars are not part of expressions 1-5, but are a part of the top-level call expression that operates on them all.
There are many layers of grammar that make up a call. It is entirely arbitrary what you call an expression or some other grammatical entity. `*expr4` is parsed as an argument, the same as `expr2`. Cheers, Mark.
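The question of what `[print("a"), *None, print("b")]` prints before raising can be answered empirically for whichever interpreter is at hand. The sketch below captures standard output around the expression; `printed_before_error` is a hypothetical helper name used only here.

```python
import contextlib
import io


def printed_before_error():
    """Evaluate the display and return (words printed, exception raised)."""
    captured = io.StringIO()
    raised = None
    with contextlib.redirect_stdout(captured):
        try:
            [print("a"), *None, print("b")]
        except TypeError as exc:
            raised = exc
    return captured.getvalue().split(), raised


printed, raised = printed_before_error()
print(printed)                 # ['a'] or ['a', 'b'], depending on evaluation order
print(type(raised).__name__)   # TypeError in both cases
```

Strict left-to-right interleaving gives only "a"; evaluating every element before unpacking gives "a" and "b". The exception type is the same under both readings.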
Mark Shannon disagrees with me (I'll let him reply rather than attempt to summarize his argument for him), but we figured it might be better to get more input here on exactly whether you all think the behavior should change or not. You can see the discussion on the PR itself for some additional points and context.
Thanks!
Brandt _______________________________________________ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-leave@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/4HS2ZEQW... Code of Conduct: http://python.org/psf/codeofconduct/
On Thu, 6 Feb 2020 at 20:17, Mark Shannon <mark@hotpy.org> wrote:
I recently unintentionally changed the semantics of this expression `[print("a"), *None, print("b")]`. PEP 448 states that this should raise an exception, but does not specify evaluation order.
My implementation was based on the general principle that evaluation in Python is left to right unless specified otherwise.
The question is, what should
[print("a"), *None, print("b")]
print before raising an exception? I think just "a", Brandt thinks "a" and "b".
Brandt argues that I have introduced a bug. I think I have fixed one, admittedly one that I didn't previously realize existed.
I think that if this were a new feature, either order would be arguable. But given that this code previously printed "a b", I think that changing the order is a change in user-visible behaviour and therefore the existing behaviour should be preserved (certainly in bugfix releases).
There is a precedent for fixing evaluation order to be left to right: https://bugs.python.org/issue29652
Changing the order as part of a feature release, listing it as a user-visible behaviour change in "What's New" seems acceptable to me. But characterising this change as a bug fix doesn't.
On 06/02/2020 6:28 am, Brandt Bucher wrote:
Commits 13bc139 and 8a4cd70 introduced subtle changes in the evaluation logic of unpacking operations. Previously, all elements were evaluated prior to being collected in a container. Now, these operations are interleaved. For example, the code `[*a, *b]` used to evaluate in the order `a` -> `b` -> `a.__iter__()` -> `b.__iter__()`. Now, it evaluates as `a` -> `a.__iter__()` -> `b` -> `b.__iter__()`.
I believe this breaking semantic change is a bug, and I've opened a PR to fix it (https://github.com/python/cpython/pull/18264). My reasoning is that "unary *" isn't an operator; it doesn't appear on the operator precedence table in the docs, and you can't evaluate `*x`. Like the brackets and the comma, it's part of the syntax of the outer display expression, not the inner one. It specifies how the list should be built, so it should be evaluated last, as part of the list construction. And it has always been this way since PEP 448 (as far as I can tell).
I agree that it's the *change in behaviour* that's the issue here. We should fix that (by reverting to 3.8.1 behaviour) before 3.8.2 gets released.
The lack of explicitly listed precedence for an operator does not mean that it isn't an operator, merely that it doesn't need precedence due to the grammar. For example the slice creation operator `x:y` in `a[x:y]` needs no precedence as it is constrained to only occur in indexing operations. Likewise the unpacking operation `*a` can only occur in certain expressions. That doesn't mean that is not an operation.
I don't think this is a particularly helpful way of looking at it. I don't think there's any particular need here to try to argue that either behaviour is "right" or "wrong". The previous behaviour has been round for some time, and the change in behaviour was (by your own admission) inadvertent. Therefore it seems obvious to me that the reasonable thing to do is to apply Brandt's PR, that restores the old evaluation order (with the *intended* fix from your patch intact, as I understand it). If, once this has been done, you still care strongly enough to argue for a behaviour change, targeted at 3.9 (assuming no-one insists on a deprecation period for the change!), then that's fine. Personally I think the arguments either way are weak, and I'd be inclined not to care, or to mildly prefer not bothering, in such a debate - but let's have the debate once the pressure of "is it OK to do this in a bugfix release?" has been removed. Paul
Then there’s nothing to do here right? Or just add it to whatsnew?
On Thu, Feb 6, 2020 at 13:20 Brandt Bucher <brandtbucher@gmail.com> wrote:
We should fix that (by reverting to 3.8.1 behaviour) before 3.8.2 gets released.
The commits which changed the behavior were bytecode/compiler changes that only went to master. I don't think they are present on any other branches.
-- --Guido (mobile)
I agree that if this is only in 3.9 then this is a cleanup of semantics that were a bit off and should stay but get a mention in What's New.
I like Mark’s new semantics better, but agree with the point about this being a “feature”.
On Thu, Feb 6, 2020 at 13:06 Paul Moore <p.f.moore@gmail.com> wrote:
On Thu, 6 Feb 2020 at 20:17, Mark Shannon <mark@hotpy.org> wrote:
I recently unintentionally changed the semantics of this expression `[print("a"), *None, print("b")]`. PEP 448 states that this should raise an exception, but does not specify evaluation order.
My implementation was based on the general principle that evaluation in Python is left to right unless specified otherwise.
The question is, what should
[print("a"), *None, print("b")]
print before raising an exception? I think just "a", Brandt thinks "a" and "b".
Brandt argues that I have introduced a bug. I think I have fixed one, admittedly one that I didn't previously realize existed.
I think that if this were a new feature, either order would be arguable. But given that this code previously printed "a b", I think that changing the order is a change in user-visible behaviour and therefore the existing behaviour should be preserved (certainly in bugfix releases).
There is a precedent for fixing evaluation order to be left to right: https://bugs.python.org/issue29652
Changing the order as part of a feature release, listing it as a user-visible behaviour change in "What's New" seems acceptable to me. But characterising this change as a bug fix doesn't.
On 06/02/2020 6:28 am, Brandt Bucher wrote:
Commits 13bc139 and 8a4cd70 introduced subtle changes in the evaluation logic of unpacking operations. Previously, all elements were evaluated prior to being collected in a container. Now, these operations are interleaved. For example, the code `[*a, *b]` used to evaluate in the order `a` -> `b` -> `a.__iter__()` -> `b.__iter__()`. Now, it evaluates as `a` -> `a.__iter__()` -> `b` -> `b.__iter__()`.
I believe this breaking semantic change is a bug, and I've opened a PR to fix it (https://github.com/python/cpython/pull/18264). My reasoning is that "unary *" isn't an operator; it doesn't appear on the operator precedence table in the docs, and you can't evaluate `*x`. Like the brackets and the comma, it's part of the syntax of the outer display expression, not the inner one. It specifies how the list should be built, so it should be evaluated last, as part of the list construction. And it has always been this way since PEP 448 (as far as I can tell).
I agree that it's the *change in behaviour* that's the issue here. We should fix that (by reverting to 3.8.1 behaviour) before 3.8.2 gets released.
The lack of explicitly listed precedence for an operator does not mean that it isn't an operator, merely that it doesn't need precedence due to the grammar. For example the slice creation operator `x:y` in `a[x:y]` needs no precedence as it is constrained to only occur in indexing operations. Likewise the unpacking operation `*a` can only occur in certain expressions. That doesn't mean that is not an operation.
I don't think this is a particularly helpful way of looking at it. I don't think there's any particular need here to try to argue that either behaviour is "right" or "wrong". The previous behaviour has been round for some time, and the change in behaviour was (by your own admission) inadvertent. Therefore it seems obvious to me that the reasonable thing to do is to apply Brandt's PR, that restores the old evaluation order (with the *intended* fix from your patch intact, as I understand it). If, once this has been done, you still care strongly enough to argue for a behaviour change, targeted at 3.9 (assuming no-one insists on a deprecation period for the change!), then that's fine. Personally I think the arguments either way are weak, and I'd be inclined not to care, or to mildly prefer not bothering, in such a debate - but let's have the debate once the pressure of "is it OK to do this in a bugfix release?" has been removed.
Paul
-- --Guido (mobile)
06.02.20 08:28, Brandt Bucher wrote:
Commits 13bc139 and 8a4cd70 introduced subtle changes in the evaluation logic of unpacking operations. Previously, all elements were evaluated prior to being collected in a container. Now, these operations are interleaved. For example, the code `[*a, *b]` used to evaluate in the order `a` -> `b` -> `a.__iter__()` -> `b.__iter__()`. Now, it evaluates as `a` -> `a.__iter__()` -> `b` -> `b.__iter__()`.
I believe this breaking semantic change is a bug, and I've opened a PR to fix it (https://github.com/python/cpython/pull/18264). My reasoning is that "unary *" isn't an operator; it doesn't appear on the operator precedence table in the docs, and you can't evaluate `*x`. Like the brackets and the comma, it's part of the syntax of the outer display expression, not the inner one. It specifies how the list should be built, so it should be evaluated last, as part of the list construction. And it has always been this way since PEP 448 (as far as I can tell).
The docs themselves seem to support this line of reasoning (https://docs.python.org/3/reference/expressions.html#evaluation-order):
In the following lines, expressions will be evaluated in the arithmetic order of their suffixes: ... expr1(expr2, expr3, *expr4, **expr5)
Note that the stars are not part of expressions 1-5, but are a part of the top-level call expression that operates on them all.
Mark Shannon disagrees with me (I'll let him reply rather than attempt to summarize his argument for him), but we figured it might be better to get more input here on exactly whether you all think the behavior should change or not. You can see the discussion on the PR itself for some additional points and context.
I have two problems with this change.
1. It changes error messages.
>>> print(*1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: print() argument after * must be an iterable, not int
>>> print(*1, *2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: Value after * must be an iterable, not int
In 3.8 you got the same error message in both cases.
>>> print(*1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: print() argument after * must be an iterable, not int
>>> print(*1, *2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: print() argument after * must be an iterable, not int
I am not sure whether the function name is useful information, but some effort was spent to preserve it. In any case, error messages should be consistent.
2. It introduces a performance regression.
In 3.8 the bytecode for `(*a, *b, *c)` was:
  1           0 LOAD_NAME                0 (a)
              2 LOAD_NAME                1 (b)
              4 LOAD_NAME                2 (c)
              6 BUILD_TUPLE_UNPACK       3
In master it is:
  1           0 BUILD_LIST               0
              2 LOAD_NAME                0 (a)
              4 LIST_EXTEND              1
              6 LOAD_NAME                1 (b)
              8 LIST_EXTEND              1
             10 LOAD_NAME                2 (c)
             12 LIST_EXTEND              1
             14 LIST_TO_TUPLE
The bytecode is larger, therefore slower. It also prevents possible optimization of BUILD_TUPLE_UNPACK and similar opcodes for the common case of tuples and lists, which would make it possible to minimize the number of memory allocations.
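The bytecode listings above can be regenerated on any interpreter with the `dis` module. This is a minimal sketch; the opcode names it prints are whatever the running version emits and may match neither listing exactly.

```python
import dis

# Compile the tuple display as a standalone expression; a, b and c are never
# looked up because the code object is only inspected, not executed.
code = compile("(*a, *b, *c)", "<example>", "eval")
opnames = [instruction.opname for instruction in dis.get_instructions(code)]
print(opnames)

# Which of the two compilation strategies from the listings is in use?
uses_single_unpack_opcode = "BUILD_TUPLE_UNPACK" in opnames
uses_incremental_extend = "LIST_EXTEND" in opnames
print(uses_single_unpack_opcode, uses_incremental_extend)
```

Inspecting the opcodes shows only the instruction count and shape. Whether one form is actually slower would need timing measurements, which this sketch does not attempt.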
How did we move from [*a,...] to print(*a,...)? They are quite different.
On Thu, Feb 6, 2020 at 14:07 Serhiy Storchaka <storchaka@gmail.com> wrote:
06.02.20 08:28, Brandt Bucher wrote:
Commits 13bc139 and 8a4cd70 introduced subtle changes in the evaluation logic of unpacking operations. Previously, all elements were evaluated prior to being collected in a container. Now, these operations are interleaved. For example, the code `[*a, *b]` used to evaluate in the order `a` -> `b` -> `a.__iter__()` -> `b.__iter__()`. Now, it evaluates as `a` -> `a.__iter__()` -> `b` -> `b.__iter__()`.
I believe this breaking semantic change is a bug, and I've opened a PR to fix it (https://github.com/python/cpython/pull/18264). My reasoning is that "unary *" isn't an operator; it doesn't appear on the operator precedence table in the docs, and you can't evaluate `*x`. Like the brackets and the comma, it's part of the syntax of the outer display expression, not the inner one. It specifies how the list should be built, so it should be evaluated last, as part of the list construction. And it has always been this way since PEP 448 (as far as I can tell).
The docs themselves seem to support this line of reasoning (https://docs.python.org/3/reference/expressions.html#evaluation-order):
In the following lines, expressions will be evaluated in the arithmetic order of their suffixes: ... expr1(expr2, expr3, *expr4, **expr5)
Note that the stars are not part of expressions 1-5, but are a part of the top-level call expression that operates on them all.
Mark Shannon disagrees with me (I'll let him reply rather than attempt to summarize his argument for him), but we figured it might be better to get more input here on exactly whether you all think the behavior should change or not. You can see the discussion on the PR itself for some additional points and context.
I have two problems with this change.
1. It changes error messages.
>>> print(*1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: print() argument after * must be an iterable, not int
>>> print(*1, *2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: Value after * must be an iterable, not int
In 3.8 you got the same error message in both cases.
>>> print(*1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: print() argument after * must be an iterable, not int
>>> print(*1, *2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: print() argument after * must be an iterable, not int
I am not sure whether the function name is useful information, but some effort was spent to preserve it. In any case, error messages should be consistent.
2. It introduces a performance regression.
In 3.8 the bytecode for `(*a, *b, *c)` was:
  1           0 LOAD_NAME                0 (a)
              2 LOAD_NAME                1 (b)
              4 LOAD_NAME                2 (c)
              6 BUILD_TUPLE_UNPACK       3
In master it is:
  1           0 BUILD_LIST               0
              2 LOAD_NAME                0 (a)
              4 LIST_EXTEND              1
              6 LOAD_NAME                1 (b)
              8 LIST_EXTEND              1
             10 LOAD_NAME                2 (c)
             12 LIST_EXTEND              1
             14 LIST_TO_TUPLE
The bytecode is larger, therefore slower. It also prevents possible optimization of BUILD_TUPLE_UNPACK and similar opcodes for the common case of tuples and lists, which would make it possible to minimize the number of memory allocations.
-- --Guido (mobile)
On Thu, 6 Feb 2020 at 23:14, Guido van Rossum <guido@python.org> wrote:
How did we move from [*a,...] to print(*a,...)? They are quite different.
It was a good way to demonstrate evaluation order, as an expression with a visible side effect. What the expression [print("a"), *None, print("b")] prints before the "cannot unpack NoneType" exception, demonstrates what order the expressions were evaluated in.
Paul
Sorry, ignore that - I see Serhiy used "print(*a)".
Paul
On Fri, 7 Feb 2020 at 08:10, Paul Moore <p.f.moore@gmail.com> wrote:
On Thu, 6 Feb 2020 at 23:14, Guido van Rossum <guido@python.org> wrote:
How did we move from [*a,...] to print(*a,...)? They are quite different.
It was a good way to demonstrate evaluation order, as an expression with a visible side effect. What the expression [print("a"), *None, print("b")] prints before the "cannot unpack NoneType" exception, demonstrates what order the expressions were evaluated in.
Paul
07.02.20 01:00, Guido van Rossum wrote:
How did we move from [*a,...] to print(*a,...)? They are quite different.
They are quite similar. The code for `(*a, *b, *c)` is:
  1           0 LOAD_NAME                0 (a)
              2 LOAD_NAME                1 (b)
              4 LOAD_NAME                2 (c)
              6 BUILD_TUPLE_UNPACK       3
The code for `print(*a, *b, *c)` is:
  1           0 LOAD_NAME                0 (print)
              2 LOAD_NAME                1 (a)
              4 LOAD_NAME                2 (b)
              6 LOAD_NAME                3 (c)
              8 BUILD_TUPLE_UNPACK_WITH_CALL 3
             10 CALL_FUNCTION_EX         0
It is covered by PEP 448 [1].
* BUILD_TUPLE_UNPACK, BUILD_LIST_UNPACK, BUILD_SET_UNPACK and BUILD_MAP_UNPACK were used to unpack iterables or mappings in tuple, list, set and dict displays.
* BUILD_TUPLE_UNPACK_WITH_CALL and BUILD_MAP_UNPACK_WITH_CALL were used when passing multiple var-positional and var-keyword arguments to a function.
All of them except BUILD_TUPLE_UNPACK_WITH_CALL were added in issue2292 [2]. BUILD_TUPLE_UNPACK_WITH_CALL was added in issue28257 [3] to unify error messages.
[1] https://www.python.org/dev/peps/pep-0448/
[2] https://bugs.python.org/issue2292
[3] https://bugs.python.org/issue28257
On 06/02/2020 9:56 pm, Serhiy Storchaka wrote:
06.02.20 08:28, Brandt Bucher wrote:
Commits 13bc139 and 8a4cd70 introduced subtle changes in the evaluation logic of unpacking operations. Previously, all elements were evaluated prior to being collected in a container. Now, these operations are interleaved. For example, the code `[*a, *b]` used to evaluate in the order `a` -> `b` -> `a.__iter__()` -> `b.__iter__()`. Now, it evaluates as `a` -> `a.__iter__()` -> `b` -> `b.__iter__()`.
I believe this breaking semantic change is a bug, and I've opened a PR to fix it (https://github.com/python/cpython/pull/18264). My reasoning is that "unary *" isn't an operator; it doesn't appear on the operator precedence table in the docs, and you can't evaluate `*x`. Like the brackets and the comma, it's part of the syntax of the outer display expression, not the inner one. It specifies how the list should be built, so it should be evaluated last, as part of the list construction. And it has always been this way since PEP 448 (as far as I can tell).
The docs themselves seem to support this line of reasoning (https://docs.python.org/3/reference/expressions.html#evaluation-order):
In the following lines, expressions will be evaluated in the arithmetic order of their suffixes: ... expr1(expr2, expr3, *expr4, **expr5)
Note that the stars are not part of expressions 1-5, but are a part of the top-level call expression that operates on them all.
Mark Shannon disagrees with me (I'll let him reply rather than attempt to summarize his argument for him), but we figured it might be better to get more input here on exactly whether you all think the behavior should change or not. You can see the discussion on the PR itself for some additional points and context.
I have two problems with this change.
1. It changes error messages.
>>> print(*1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: print() argument after * must be an iterable, not int
>>> print(*1, *2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: Value after * must be an iterable, not int
In 3.8 you got the same error message in both cases.
>>> print(*1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: print() argument after * must be an iterable, not int
>>> print(*1, *2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: print() argument after * must be an iterable, not int
I am not sure whether the function name is useful information, but some effort was spent to preserve it. In any case, error messages should be consistent.
Including the function name in the error message is misleading. "TypeError: print() argument after * must be an iterable, not int" implies that the error is related to `print`. It is not; the error is entirely on the caller's side. The object being called is irrelevant.
In 3.8:
>>> 1(*1)
  File "<stdin>", line 1, in <module>
TypeError: int object argument after * must be an iterable, not int
"int object argument" is nonsense.
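To see which wording a given interpreter produces for the one-star and two-star calls, the messages can be captured rather than read off a traceback. This is a rough sketch; `unpacking_error` is a hypothetical helper, and the exact message text varies between versions.

```python
def unpacking_error(call):
    """Invoke `call` and return the TypeError message it raises, or None."""
    try:
        call()
    except TypeError as exc:
        return str(exc)
    return None


# Both calls fail while the arguments are being unpacked, before print() runs.
single = unpacking_error(lambda: print(*1))
double = unpacking_error(lambda: print(*1, *2))

print(single)
print(double)
print(single == double)  # True only if the interpreter words both cases the same
```

Comparing the two strings makes the consistency concern concrete without taking a position on whether the function name belongs in the message.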
2. It introduces a performance regression.
In 3.8 the bytecode for `(*a, *b, *c)` was:
  1           0 LOAD_NAME                0 (a)
              2 LOAD_NAME                1 (b)
              4 LOAD_NAME                2 (c)
              6 BUILD_TUPLE_UNPACK       3
In master it is:
  1           0 BUILD_LIST               0
              2 LOAD_NAME                0 (a)
              4 LIST_EXTEND              1
              6 LOAD_NAME                1 (b)
              8 LIST_EXTEND              1
             10 LOAD_NAME                2 (c)
             12 LIST_EXTEND              1
             14 LIST_TO_TUPLE
The bytecode is larger, therefore slower. It also prevents possible optimization of BUILD_TUPLE_UNPACK and similar opcodes for the common case of tuples and lists, which would make it possible to minimize the number of memory allocations.
That's just not true. Larger bytecode is not necessarily slower, in fact if the operations are more efficient, it can easily be faster. Please don't waste your efforts "optimizing" rare bytecodes like "BUILD_TUPLE_UNPACK". It just makes the interpreter bigger, and has no effect on speed because they are rarely executed. In the compiled standard library, the most common unpacking bytecode `LIST_EXTEND` represents less than 1/1000 of the (static) total (557 out of 669k). Cheers, Mark.
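A static opcode count of the kind quoted above can be approximated by walking the code objects nested inside a compiled module. The sketch below is an assumption about how such a count might be made, not the method actually used for the 557 out of 669k figure. `SAMPLE_SOURCE` and `static_opcode_counts` are hypothetical, and a tiny sample says nothing about the standard library.

```python
import collections
import dis
import inspect

# Hypothetical stand-in for a real module's source code.
SAMPLE_SOURCE = '''
def merge(a, b):
    return [*a, *b]

def total(values):
    result = 0
    for value in values:
        result += value
    return result
'''


def static_opcode_counts(code):
    """Count opcode names in `code` and in every code object nested in it."""
    counts = collections.Counter(
        instruction.opname for instruction in dis.get_instructions(code)
    )
    for constant in code.co_consts:
        if inspect.iscode(constant):
            counts.update(static_opcode_counts(constant))
    return counts


counts = static_opcode_counts(compile(SAMPLE_SOURCE, "<sample>", "exec"))
total_instructions = sum(counts.values())
# Cover both the older single-opcode form and the newer incremental form.
unpack_instructions = counts["LIST_EXTEND"] + counts["BUILD_LIST_UNPACK"]
print(f"{unpack_instructions} of {total_instructions} instructions are list-unpacking opcodes")
```

A static count like this measures how often an opcode appears in compiled code, not how often it executes, which is the distinction the "(static)" qualifier draws.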
On Fri, Feb 7, 2020 at 1:48 AM Mark Shannon <mark@hotpy.org> wrote:
Including the function name in the error message is misleading.
"TypeError: print() argument after * must be an iterable, not int" implies that the error is related to `print`. It is not; the error is entirely on the caller's side. The object being called is irrelevant.
That's true, but the function name may help the user find the right call in the code if there are multiple calls on the line. -- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
participants (7)
- Brandt Bucher
- Brett Cannon
- Guido van Rossum
- Mark Shannon
- Paul Moore
- Serhiy Storchaka
- Terry Reedy