Some background. PEP 3132 (https://peps.python.org/pep-3132/) lists the following:
Possible changes discussed were:
- Only allow a starred expression as the last item in the exprlist. This would simplify the unpacking code a bit and allow for the starred expression to be assigned an iterator. This behavior was rejected because it would be too surprising.
The important use case in Python for the proposed semantics is when
you have a variable-length record, the first few items of which are
interesting, and the rest of which is less so, but not unimportant.
(If you wanted to throw the rest away, you'd just write a, b, c =
x[:3] instead of a, b, c, *d = x.)
There was also discussion about retaining the type of the object on the RHS, e.g.:
c, *rest = "chair"  # c="c", rest="hair"
it = iter(range(10))
x, *rest = it  # x=0, rest is a reference to `it`
That proposal was rejected because the types were too confusing, e.g.:
header, *lines = open("some-file", "r") # lines is an iterator
header, *lines, footer = open("some-file", "r") # lines is a list
From an implementation POV, if you have an unknown object on
the RHS, you have to try slicing it before you try iterating over it;
this may cause problems e.g. if the object happens to be a defaultdict
-- since x[3:] is implemented as x[slice(3, None, None)], the
defaultdict will give you its default value. I'd much rather define
this in terms of iterating over the object until it is exhausted,
which can be optimized for certain known types like lists and tuples.
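To make the slicing pitfall concrete, here's a toy class of my own (not from the PEP) whose __getitem__ happily "succeeds" for any key; a slice-then-iterate unpacking strategy would silently pick up the bogus value instead of falling back to iteration:
>>> class Defaulting:
...     def __getitem__(self, key):
...         return "default"  # accepts any key, including slices
...
>>> Defaulting()[3:]  # slicing doesn't raise, so nothing signals the fallback
'default'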
It seems like these objections don't apply here, if we define a syntax that explicitly says not to assign anything: there is no type inconsistency to be confused by. E.g. in the proposal here:
header, *... = open("some-file", "r")
header, *..., footer = open("some-file", "r")
It's clear that to compute the footer you would need to iterate over the whole file, whereas in the first form you don't.
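For comparison, here's a sketch of roughly what those two forms would have to look like in current Python (assuming the file has at least two lines; "some-file" is just the placeholder name from above):

with open("some-file", "r") as f:
    header = next(f)  # header, *... = f: stop after the first line

with open("some-file", "r") as f:
    header = next(f)
    for footer in f:  # header, *..., footer = f: must drain the
        pass          # whole file just to reach the last line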
So historically, the idea here was discussed and rejected, but for a reason that does not apply in this case.
=======
Regarding utility, there are many somewhat ugly ways of doing this with function calls, especially from itertools. I tend to prefer syntax over functions for handling basic data types, partly because it's more readable: almost any call that takes more than one positional argument introduces cognitive load, because you have to remember what order the arguments come in and what they mean. You can add keyword arguments to improve readability, but then it's more characters and you have to remember the names or have them autocompleted. So when a common use case can be supported with simple built-in syntax, doing so improves the utility of the language.
Like honestly, does anyone remember the arguments to `islice`? I'm fairly sure I've had to look them up every single time I've used it; for iterator-heavy code, that can be several times in the same day. For the `next(iterator, [default], count=1)` proposal, it's very easy to write incorrect code that looks correct, e.g. `next(iterator, 3)`: does the 3 refer to the count or the default? If you've written Python for years it's clear, but it's much less clear to a novice.
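To make that concrete: `islice`'s second positional argument changes meaning depending on how many arguments you pass, which is exactly the kind of thing you end up looking up:
>>> from itertools import islice
>>> list(islice(range(10), 3))     # one extra argument: 3 is the *stop*
[0, 1, 2]
>>> list(islice(range(10), 3, 6))  # two extra arguments: now 3 is the *start*
[3, 4, 5]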
There are efficiency arguments too: function calls are expensive, whereas dedicated bytecode instructions can be optimized much more aggressively. If you're already reaching for iterators, efficiency is probably relevant:
>>> import dis
>>> from itertools import islice
>>> def first_two_islice(it):
... return tuple(islice(it, 2))
...
>>> def first_two_destructuring(it):
... x, y, *rest = it
... return x, y
...
>>> dis.dis(first_two_islice)
2 0 LOAD_GLOBAL 0 (tuple)
2 LOAD_GLOBAL 1 (islice)
4 LOAD_FAST 0 (it)
6 LOAD_CONST 1 (2)
8 CALL_FUNCTION 2
10 CALL_FUNCTION 1
12 RETURN_VALUE
>>> dis.dis(first_two_destructuring)
2 0 LOAD_FAST 0 (it)
2 UNPACK_EX 2
4 STORE_FAST 1 (x)
6 STORE_FAST 2 (y)
8 STORE_FAST 3 (rest)
3 10 LOAD_FAST 1 (x)
12 LOAD_FAST 2 (y)
14 BUILD_TUPLE 2
16 RETURN_VALUE
The latter requires no expensive CALL_FUNCTION operations, though it does currently allocate `rest` pointlessly.
Personally, I think the main use case would be handling large lists in a memory-efficient and readable manner. Currently, using *_ means you have to balance readability against performance. Why is there that tradeoff? Does it serve literally any purpose? I think about this /every single time/ I use starred unpacking and don't care about the starred part. I don't want to think about how big *thing is, but the language forces me to give it a name and allocate memory for it. It would be a minor improvement to be able to write an expression that is just as readable but carries no performance penalty. That penalty will be minor in most cases, but you still have to stop and think about whether it's minor, and that thinking is itself a cost of the existing syntax.
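As a rough illustration of that allocation (exact sizes vary by platform and Python version):

import sys

data = list(range(1_000_000))
first, *_ = data
# the throwaway list really exists: on a typical 64-bit CPython build,
# its pointer storage alone is on the order of 8 MB
print(sys.getsizeof(_))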
=======
It seems like a lot of the arguments against this syntax would apply equally well to existing syntax. If indexing, next, islice, etc. were good enough, why were PEPs like 3132, 448, or 636 accepted? This proposal is a natural extension of a trend over the last several versions of Python toward making these sorts of expressions more and more expressive. It's polishing a minor rough spot in a syntax that has been developing for years, which seems like a good idea regardless of whether somewhat-usable alternatives exist in the standard library.