[Python-ideas] Fwd: Fwd: unpacking generalisations for list comprehension
Steven D'Aprano
steve at pearwood.info
Fri Oct 14 22:38:08 EDT 2016
On Thu, Oct 13, 2016 at 05:30:49PM -0400, Random832 wrote:
> Frankly, I don't see why the pattern isn't obvious
*shrug*
Maybe your inability to look past your assumptions and see things from
other people's perspective is just as much a blind spot as our inability
to see why you think the pattern is obvious. We're *all* having
difficulty in seeing things from the other side's perspective here.
Let me put it this way: as far as I am concerned, sequence unpacking is
equivalent to manually replacing the sequence with its items:
t = (1, 2, 3)
[100, 200, *t, 300]
is equivalent to replacing "*t" with "1, 2, 3", which gives us:
[100, 200, 1, 2, 3, 300]
That's nice, simple, it makes sense, and it works in sufficiently recent
Python versions. It applies to function calls and assignments:
func(100, 200, *t) # like func(100, 200, 1, 2, 3)
a, b, c, d, e = 100, 200, *t # like a, b, c, d, e = 100, 200, 1, 2, 3
although it doesn't apply when the star is on the left-hand side:
a, b, *x, e = 1, 2, 3, 4, 5, 6, 7
That requires a different model for starred names, but *that* model is
similar to its role in function parameters: def f(*args). But I digress.
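As a minimal sketch of the substitution model above, the following can be run on Python 3.5 or later. The body of func is a hypothetical stand-in (the original only names the call), and the names suffixed with 2 are invented so the two assignments don't clobber each other:

```python
t = (1, 2, 3)

# Hypothetical stand-in for func: it just returns the positional
# arguments it received, so the expansion is visible.
def func(*args):
    return args

# Star on the right-hand side: "*t" behaves like writing "1, 2, 3" in place.
display = [100, 200, *t, 300]
call_args = func(100, 200, *t)
a, b, c, d, e = 100, 200, *t

# Star on the left-hand side: the starred name collects the leftover
# items into a list instead of expanding anything.
a2, b2, *x, e2 = 1, 2, 3, 4, 5, 6, 7

print(display)    # [100, 200, 1, 2, 3, 300]
print(call_args)  # (100, 200, 1, 2, 3)
print(x)          # [3, 4, 5, 6]
```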
Now let's apply that same model of "starred expression == expand the
sequence in place" to a list comp:
iterable = [t]
[*t for t in iterable]
If you do the same manual replacement, you get:
[1, 2, 3 for t in iterable]
which isn't legal since it looks like a list display [1, 2, ...]
containing invalid syntax. The only way to have this make sense is to
use parentheses:
[(1, 2, 3) for t in iterable]
which turns [*t for t in iterable] into a no-op.
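To make the two competing readings concrete, here is a small sketch that spells each one out in syntax that is already legal. The second tuple in iterable is an addition for illustration (the example above uses only [t]), and the names no_op and flattened are invented here:

```python
t = (1, 2, 3)
iterable = [t, (4, 5)]  # second tuple added only to make the difference visible

# Reading 1, the "replace *t with its items, then parenthesise" model:
# each element is rebuilt as a tuple, so nothing really changes.
no_op = [(*t,) for t in iterable]

# Reading 2, the proposed "extend instead of append" meaning,
# written out as a nested comprehension.
flattened = [x for t in iterable for x in t]

print(no_op)      # [(1, 2, 3), (4, 5)]
print(flattened)  # [1, 2, 3, 4, 5]
```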
Why should the OP's complicated, hard to understand (to many of us)
interpretation take precedence over the simple, obvious, easy to
understand model of sequence unpacking that I describe here?
That's not a rhetorical question. If you have a good answer, please
share it. But I strongly believe that on the evidence of this thread,
[a, b, *t, d]
is easy to explain, teach and understand, while:
[*t for t in iterable]
will be confusing, hard to teach and understand except as "magic syntax"
-- it works because the interpreter says it works, not because it
follows from the rules of sequence unpacking or comprehensions. It might
as well be spelled:
[ MAGIC!!!! HAPPENS!!!! HERE!!!! t for t in iterable]
except it is shorter.
Of course, ultimately all syntax is "magic", it all needs to be learned.
There's nothing about + that inherently means plus. But we should
strongly prefer to avoid overloading the same symbol with distinct
meanings, and * is one of the most heavily overloaded symbols in Python:
- multiplication and exponentiation
- wildcard imports
- globs, regexes
- collect arguments and kwargs
- sequence unpacking
- collect unused elements from a sequence
and maybe more. This will add yet another special meaning:
- expand the comprehension ("extend instead of append").
If we're going to get this (possibly useful?) functionality, I'd rather
see an explicit flatten() builtin, or see it spelled:
[from t for t in sequence]
which at least is *obviously* something magical, than yet another magic
meaning to the star operator. It's easy to look it up in the docs or
google for it, and doesn't look like Perlish line noise.
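For comparison, an explicit flatten can already be sketched with the standard library. There is no flatten() builtin; the helper below is hypothetical, flattens one level only, and is built on itertools.chain.from_iterable:

```python
from itertools import chain

def flatten(iterable_of_iterables):
    """Hypothetical helper: flatten exactly one level of nesting into a list."""
    return list(chain.from_iterable(iterable_of_iterables))

sequence = [(1, 2, 3), (4, 5)]
print(flatten(sequence))  # [1, 2, 3, 4, 5]
```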
--
Steve