[Python-ideas] Fwd: Fwd: unpacking generalisations for list comprehension
Steven D'Aprano
steve at pearwood.info
Sat Oct 15 06:38:17 EDT 2016
On Sat, Oct 15, 2016 at 04:42:13AM -0400, Random832 wrote:
> On Sat, Oct 15, 2016, at 04:00, Steven D'Aprano wrote:
> > > This is unpacking. It unpacks the results into the destination.
> >
> > If it were unpacking as it is understood today, with no other changes,
> > it would be a no-op. (To be technical, it would convert whatever
> > iterable t is into a tuple.)
>
> If that were true, it would be a no-op everywhere.
That's clearly not the case.
x = (1, 2, 3)
[100, 200, *x, 300]
If you do it the way I say, and replace *x with the individual items of
x, you get this:
[100, 200, 1, 2, 3, 300]
which conveniently happens to be what you already get in Python. You
claim that if I were right it should be a no-op -- that doesn't follow.
Why would it be a no-op? I've repeatedly shown the transformation to
use, and it clearly does what I say it should. How could it not?
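A minimal sketch to check that transformation in an interpreter (the names
`unpacked` and `by_hand` are illustrative only, not from the thread):

```python
# Sketch: *x inside a list display behaves as if replaced by the items of x.
x = (1, 2, 3)

unpacked = [100, 200, *x, 300]        # using the star operator
by_hand = [100, 200, 1, 2, 3, 300]    # the items of x written out manually

print(unpacked == by_hand)  # True
```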
> > I've covered that in an earlier post: if
> > you replace *t with the actual items of t, you DON'T get:
>
> Replacing it _with the items_ is not the same thing as replacing it
> _with a sequence containing the items_,
I don't think I ever used the phrasing "a sequence containing the
items". I think that's *your* phrase, not mine.
I may have said "with the sequence of items" or words to that effect.
These two phrases do have different meanings:
x = (1, 2, 3)
[100, 200, *x, 300]
# Replace *x with "a sequence containing items of x"
[100, 200, [1, 2, 3], 300]
# Replace *x with "the sequence of items of x"
[100, 200, 1, 2, 3, 300]
Clearly they have different meanings. I'm confident that I've always
made it clear that I'm referring to the second, not the first, but I'm
only human and if I've been unclear or used the wrong phrasing, my
apologies.
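To make the difference between the two phrasings concrete, here is a small
sketch (the names `nested` and `spliced` are illustrative assumptions):

```python
x = (1, 2, 3)

# "a sequence containing the items of x": one extra element, itself a list
nested = [100, 200, list(x), 300]

# "the sequence of items of x": the items themselves, spliced in place
spliced = [100, 200, *x, 300]

print(len(nested))   # 4
print(len(spliced))  # 6
```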
But nit-picking about the exact phrasing used aside, it is clear that
expanding the *t in a list comprehension:
[*t for t in iterable]
to flatten the iterable cannot be analogous to this. Whatever
explanation you give for why *t expands the list comprehension, it
cannot be given in terms of replacing *t with the items of t. There has
to be some magic to give it the desired special behaviour.
> and you're trying to pull a fast
> one by claiming it is by using the fact that the "equivalent loop"
> (which is and has always been a mere fiction, not a real transformation
> actually performed by the interpreter) happens to use a sequence of
> tokens that would cause a tuple to be created if a comma appears in the
> relevant position.
I don't know what "equivalent loop" you are accusing me of misusing.
The core developers have made it absolutely clear that changing the
fundamental equivalence of list comps as syntactic sugar for:
result = []
for t in iterable:
    result.append(t)
is NOT NEGOTIABLE. (That is much to my disappointment -- I would love to
introduce a "while" version of list comps to match the "if" version, but
that's not an option.)
So regardless of whether it is a fiction or an actual loop, Python's
list comprehensions are categorically limited to behave equivalently to
the loop above (modulo scope, temporary variables, etc). If you want to
change that -- change the append to an extend, for example -- you need
to make a case for that change which is strong enough to overcome
Guido's ruling. (Maybe Guido will be willing to bend his ruling to allow
extend as well.)
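As a rough illustration of that equivalence (ignoring scope and temporary
variables, and using made-up sample data), the comprehension and the
append loop build the same list:

```python
# Sample data is an assumption for illustration only.
iterable = [(1, 2), (3, 4), (5, 6)]

# The list comprehension form.
comp = [t for t in iterable]

# The loop-with-append form it is described as being sugar for.
result = []
for t in iterable:
    result.append(t)

print(comp == result)  # True
```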
There are three ways to get the desired flatten() behaviour from
a list comp. One way is to explicitly add a second loop, which has the
benefit of already working:
[x for t in iterable for x in t]
Another is to swap out the append for an extend:
[*t for t in iterable]
# real or virtual transformation, it doesn't matter
result = []
for t in iterable:
    result.extend(t)
And the third is to keep the append but insert an extra virtual loop:
# real or virtual transformation, it still doesn't matter
result = []
for t in iterable:
    for x in t:
        result.append(x)
Neither the second nor the third suggestion matches the equivalent loop
form given above. Neither is an obvious extension of the way sequence
unpacking works in other contexts.
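For reference, a sketch comparing the explicit two-loop comprehension with
the extend and nested-append loops; the sample data and the result names
are assumptions for illustration. All three produce the same flattened
list; the disagreement is only about which of them, if any, [*t for t in
iterable] should be taken to mean.

```python
# Sample data is an assumption for illustration only.
iterable = [(1, 2), (3, 4, 5), ()]

# 1. Explicit second loop in the comprehension (already valid Python).
two_loops = [x for t in iterable for x in t]

# 2. Loop with extend in place of append.
via_extend = []
for t in iterable:
    via_extend.extend(t)

# 3. Loop that keeps append but adds an inner loop.
via_inner_loop = []
for t in iterable:
    for x in t:
        via_inner_loop.append(x)

print(two_loops)                                  # [1, 2, 3, 4, 5]
print(two_loops == via_extend == via_inner_loop)  # True
```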
[...]
> Imagine that we were talking about ordinary list displays, and for some
> reason had developed a tradition of explaining them in terms of
> "equivalent" code the way we do for comprehensions.
>
> x = [a, b, c] is equivalent to:
> x = list()
> x.append(a)
> x.append(b)
> x.append(c)
>
> So now if we replace c with *c [where c == [d, e]], must we now say
> this?
> x = list()
> x.append(a)
> x.append(b)
> x.append(d, e)
>
> Well, that's just not valid at all.
Right. And if we had a tradition of saying that list displays MUST be
equivalent to the unpacked sequence of appends, then sequence unpacking
inside a list display would be prohibited. But we have no such
tradition, and sequence unpacking inside the list really is an obvious
straight line extrapolation from (say) sequence unpacking inside a
function call.
Fortunately, we have a *different* tradition when it comes to list
displays, and no ruling that *c must turn into append with multiple
arguments. Our explanation of [a, b, *c] occurs at an earlier level:
replace the *c with the items of c:
c = [d, e]
[a, b, *c] ==> [a, b, d, e]
And there is no problem.
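A short sketch of that substitution, shown both in a list display and in a
function call (`collect` is a hypothetical helper, and the string values
are placeholders):

```python
def collect(*args):
    """Hypothetical helper: return the positional arguments as a list."""
    return list(args)

a, b, d, e = "a", "b", "d", "e"
c = [d, e]

# In a list display, *c behaves like writing out the items of c.
print([a, b, *c] == [a, b, d, e])                # True

# The same substitution works in a function call.
print(collect(a, b, *c) == collect(a, b, d, e))  # True
```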
Unfortunately for you, none of this is the case for list comps. We DO
have a tradition and a BDFL ruling that list comps are strictly
equivalent to a loop with append.
And the transformation of *t into the items of t (I don't care if it is a
real transformation in the implementation, or only a fictional
transformation) cannot work in a list comp. Let's make the number of
items of t explicit so we don't have to worry about variable item
counts:
[*t for t in iterable] # t has three items
[a, b, c for (a, b, c) in iterable]
That's a syntax error. To avoid the syntax error, we need parentheses:
[(a, b, c) for (a, b, c) in iterable]
and that's a no-op. So we're back to my first response to this thread:
why on earth would you expect *t in a list comprehension to flatten the
iterable? It should be either an error, or a no-op.
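Both claims can be checked with a sketch like the following, which uses
the built-in compile() to test the unparenthesised form without running
it (`is_syntax_error` is a hypothetical helper, and the sample data is an
assumption):

```python
def is_syntax_error(source):
    """Hypothetical helper: True if source fails to compile as an expression."""
    try:
        compile(source, "<sketch>", "eval")
    except SyntaxError:
        return True
    return False

# Sample data: every t has exactly three items.
iterable = [(1, 2, 3), (4, 5, 6)]

# The literal substitution, without parentheses, does not compile.
print(is_syntax_error("[a, b, c for (a, b, c) in iterable]"))  # True

# With parentheses it compiles, and simply rebuilds the same tuples.
same = [(a, b, c) for (a, b, c) in iterable]
print(same == iterable)  # True
```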
> Clearly we must reject this
> ridiculous notion of allowing starred expressions within list displays,
> because we _can't possibly_ change the transformation to accommodate the
> new feature.
Of course we can. I've repeatedly said we can do anything we want. If we
want, we can have *t in a list comprehension be sugar for importing the
sys module, or erasing your home directory. What we can't say is that
"erasing your home directory" is an obvious straight-line extrapolation
from existing uses of the star operator. There's nothing obvious here:
this thread is proof that whatever connection (if any) between the two
is non-obvious, twisted, even strange and bizarre.
I have never argued against this suggested functionality: flattening
iterables is obviously a useful thing to do. But:
- we can already use a list comp to flatten:
[x for t in iterable for x in t]
- there's no obvious or clear connection between the *t in the suggested
syntax and existing uses of the star operator; it might as well be
spelled [magic!!!! t for t in iterable] for all the relevance sequence
unpacking has;
- if anyone can explain the connection they see, I'm listening;
(believe me, I am *trying to understand* -- but none of the given
explanations for a connection hold up as far as I am concerned)
- even if we agree that there is a connection, this thread is
categorical proof that it is not obvious: it has taken DOZENS of
emails to (allegedly) get the message across;
- if we do get syntactic sugar for flatten(), why does it have to
overload the star operator for yet another meaning?
Hence my earlier questions: do we really need this, and if so, does it
have to be spelled *t? Neither of those questions is obviously answered
with a "Yes".
--
Steve