How did we move from [*a,...] to print(*a,...)? They are quite different.

On Thu, Feb 6, 2020 at 14:07 Serhiy Storchaka <storchaka@gmail.com> wrote:
06.02.20 08:28, Brandt Bucher wrote:
> Commits 13bc139 and 8a4cd70 introduced subtle changes in the evaluation logic of unpacking operations. Previously, all elements were evaluated prior to being collected in a container. Now, these operations are interleaved. For example, the code `[*a, *b]` used to evaluate in the order `a` -> `b` -> `a.__iter__()` -> `b.__iter__()`. Now, it evaluates as `a` -> `a.__iter__()` -> `b` -> `b.__iter__()`.
>
> I believe this breaking semantic change is a bug, and I've opened a PR to fix it (https://github.com/python/cpython/pull/18264). My reasoning is that "unary *" isn't an operator; it doesn't appear on the operator precedence table in the docs, and you can't evaluate `*x`. Like the brackets and the comma, it's part of the syntax of the outer display expression, not the inner one. It specifies how the list should be built, so it should be evaluated last, as part of the list construction. And it has always been this way since PEP 448 (as far as I can tell).
>
> The docs themselves seem to support this line of reasoning (https://docs.python.org/3/reference/expressions.html#evaluation-order):
>
>> In the following lines, expressions will be evaluated in the arithmetic order of their suffixes:
>> ...
>> expr1(expr2, expr3, *expr4, **expr5)
>
> Note that the stars are not part of expressions 1-5, but are a part of the top-level call expression that operates on them all.
>
> Mark Shannon disagrees with me (I'll let him reply rather than attempt to summarize his argument for him), but we figured it might be better to get more input here on exactly whether you all think the behavior should change or not. You can see the discussion on the PR itself for some additional points and context.
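
For anyone who wants to check which of the two orders a given interpreter uses, here is a minimal sketch. `Traced` and `load` are hypothetical helpers made up for this illustration, not part of CPython: `load` stands in for evaluating the names `a` and `b`, and `Traced.__iter__` records the moment the unpacking actually iterates.

```python
log = []


class Traced:
    """Hypothetical helper: an iterable that records when __iter__ is called."""

    def __init__(self, name, items):
        self.name = name
        self.items = items

    def __iter__(self):
        log.append(f"{self.name}.__iter__()")
        return iter(self.items)


def load(name, items):
    # Stands in for evaluating the expression `a` or `b` in `[*a, *b]`.
    log.append(name)
    return Traced(name, items)


result = [*load("a", [1, 2]), *load("b", [3])]
print(result)
print(" -> ".join(log))
```

With the old behavior the log should read `a -> b -> a.__iter__() -> b.__iter__()`; with the interleaved behavior it should read `a -> a.__iter__() -> b -> b.__iter__()`. The resulting list is the same either way.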

I have two problems with this change.

1. It changes error messages.

 >>> print(*1)
Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
TypeError: print() argument after * must be an iterable, not int
 >>> print(*1, *2)
Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
TypeError: Value after * must be an iterable, not int

In 3.8 you got the same error message in both cases.

 >>> print(*1)
Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
TypeError: print() argument after * must be an iterable, not int
 >>> print(*1, *2)
Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
TypeError: print() argument after * must be an iterable, not int

I am not sure whether the function name is useful information, but
some effort was spent to preserve it. In any case, error messages should
be consistent.
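
A quick way to compare the two messages on whatever build is at hand, as a rough sketch. `star_error` is a hypothetical helper written for this check, not an existing API; it evaluates a source string and returns the text of the TypeError it raises.

```python
def star_error(source):
    """Hypothetical helper: return the TypeError message raised by `source`."""
    try:
        eval(source)
    except TypeError as exc:
        return str(exc)
    return None  # no TypeError was raised


single = star_error("print(*1)")
double = star_error("print(*1, *2)")

print(single)
print(double)
print("consistent:", single == double)
```

If the last line prints `consistent: False`, the build shows the inconsistency described above.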


2. It introduces a performance regression.

In 3.8 the bytecode for `(*a, *b, *c)` was:

   1           0 LOAD_NAME                0 (a)
               2 LOAD_NAME                1 (b)
               4 LOAD_NAME                2 (c)
               6 BUILD_TUPLE_UNPACK       3

In master it is:

   1           0 BUILD_LIST               0
               2 LOAD_NAME                0 (a)
               4 LIST_EXTEND              1
               6 LOAD_NAME                1 (b)
               8 LIST_EXTEND              1
              10 LOAD_NAME                2 (c)
              12 LIST_EXTEND              1
              14 LIST_TO_TUPLE

The bytecode is larger, therefore slower. It also prevents a possible
optimization of BUILD_TUPLE_UNPACK and similar opcodes for the common case
of tuples and lists, which would allow minimizing the number of memory
allocations.
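
To reproduce the listing on a particular build and get a rough idea of the cost, something like the following sketch could be used. The values bound to `a`, `b` and `c` are arbitrary assumptions chosen for the example, and the timing is a crude micro-benchmark rather than a rigorous measurement.

```python
import dis
import timeit

# Compile the display expression and list the opcodes this interpreter emits.
code = compile("(*a, *b, *c)", "<example>", "eval")
opnames = [ins.opname for ins in dis.get_instructions(code)]
print(opnames)

# Arbitrary sample operands (assumption: small tuples and lists, the common case).
ns = {"a": (1, 2), "b": [3, 4], "c": (5,)}
value = eval(code, ns)
print(value)

# Rough timing of the same expression; numbers vary by machine and build.
elapsed = timeit.timeit("(*a, *b, *c)", globals=ns, number=100_000)
print(f"{elapsed:.4f} s for 100,000 evaluations")
```

Running this under both versions gives the opcode sequences side by side, and the timing shows whether the extra instructions matter for inputs of this size.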
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/CZZKWFW22TBJ5VLO7GUIF7A7QBFTBAC2/
--
--Guido (mobile)