[Python-ideas] Fwd: Fwd: unpacking generalisations for list comprehension

Sat Oct 15 06:38:17 EDT 2016

On Sat, Oct 15, 2016 at 04:42:13AM -0400, Random832 wrote:
> On Sat, Oct 15, 2016, at 04:00, Steven D'Aprano wrote:
> > > This is unpacking. It unpacks the results into the destination.
> > 
> > If it were unpacking as it is understood today, with no other changes, 
> > it would be a no-op. (To be technical, it would convert whatever 
> > iterable t is into a tuple.) 
> 
> If that were true, it would be a no-op everywhere.

That's clearly not the case.

    x = (1, 2, 3)
    [100, 200, *x, 300]

If you do it the way I say, and replace *x with the individual items of 
x, you get this:

    [100, 200, 1, 2, 3, 300]

which conveniently happens to be what you already get in Python. You 
claim that if I were right it should be a no-op -- that doesn't follow. 
Why would it be a no-op? I've repeatedly shown the transformation to 
use, and it clearly does what I say it should. How could it not?

> > I've covered that in an earlier post: if 
> > you replace *t with the actual items of t, you DON'T get:
> 
> Replacing it _with the items_ is not the same thing as replacing it
> _with a sequence containing the items_, 

I don't think I ever used the phrasing "a sequence containing the 
items". I think that's *your* phrase, not mine.

I may have said "with the sequence of items" or words to that effect. 
These two phrases do have different meanings:

    x = (1, 2, 3)
    [100, 200, *x, 300]

    # Replace *x with "a sequence containing items of x"
    [100, 200, [1, 2, 3], 300]

    # Replace *x with "the sequence of items of x"
    [100, 200, 1, 2, 3, 300]

Clearly they have different meanings. I'm confident that I've always 
made it clear that I'm referring to the second, not the first, but I'm 
only human and if I've been unclear or used the wrong phrasing, my 
apologies.
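
A minimal sketch of the two readings, runnable on Python 3.5 or later. 
The variable names are only illustrative:

```python
x = (1, 2, 3)

# "the sequence of items of x": the items are spliced in place
spliced = [100, 200, *x, 300]

# "a sequence containing items of x": one nested element
nested = [100, 200, list(x), 300]

print(spliced)  # [100, 200, 1, 2, 3, 300]
print(nested)   # [100, 200, [1, 2, 3], 300]
```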

But nit-picking about the exact phrasing used aside, it is clear that 
expanding the *t in a list comprehension:

    [*t for t in iterable]

to flatten the iterable cannot be analogous to this. Whatever 
explanation you give for why *t expands the list comprehension, it 
cannot be given in terms of replacing *t with the items of t. There has 
to be some magic to give it the desired special behaviour.

> and you're trying to pull a fast
> one by claiming it is by using the fact that the "equivalent loop"
> (which is and has always been a mere fiction, not a real transformation
> actually performed by the interpreter) happens to use a sequence of
> tokens that would cause a tuple to be created if a comma appears in the
> relevant position.

I don't know what "equivalent loop" you are accusing me of misusing.

The core developers have made it absolutely clear that changing the 
fundamental equivalence of list comps as syntactic sugar for:

    result = []
    for t in iterable:
        result.append(t)

is NOT NEGOTIABLE. (That is much to my disappointment -- I would love to 
introduce a "while" version of list comps to match the "if" version, but 
that's not an option.)

So regardless of whether it is a fiction or an actual loop, Python's 
list comprehensions are categorically limited to behave equivalently to 
the loop above (modulo scope, temporary variables, etc). If you want to 
change that -- change the append to an extend, for example -- you need 
to make a case for that change which is strong enough to overcome 
Guido's ruling. (Maybe Guido will be willing to bend his ruling to allow 
extend as well.)

There are three ways to get the desired flatten() behaviour from 
a list comp. One way is to explicitly add a second loop, which has the 
benefit of already working:

    [x for t in iterable for x in t]

Another is to swap out the append for an extend:

    [*t for t in iterable]

    # real or virtual transformation, it doesn't matter
    result = []
    for t in iterable:
        result.extend(t)

And the third is to keep the append but insert an extra virtual loop:

    # real or virtual transformation, it still doesn't matter
    result = []
    for t in iterable:
        for x in t:
            result.append(x)

Neither the second nor the third suggestion matches the equivalent loop 
form given above, and neither is an obvious extension of the way 
sequence unpacking works in other contexts.
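
For concreteness, here is a sketch of the spellings that run today, on 
made-up sample data. The proposed [*t for t in iterable] form is left 
out, since it is the syntax under discussion:

```python
iterable = [(1, 2), (3, 4, 5), ()]

# 1. Explicit second loop in the comprehension.
nested_comp = [x for t in iterable for x in t]

# 2. Plain loop, with append swapped for extend.
via_extend = []
for t in iterable:
    via_extend.extend(t)

# 3. Plain loop, keeping append but adding an inner loop.
via_inner_loop = []
for t in iterable:
    for x in t:
        via_inner_loop.append(x)

# For contrast: the loop-with-append form that [t for t in iterable]
# is equivalent to. It does not flatten anything.
plain = []
for t in iterable:
    plain.append(t)
```

The first three all build the same flat list; the last one just copies 
the tuples unchanged.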

[...]
> Imagine that we were talking about ordinary list displays, and for some
> reason had developed a tradition of explaining them in terms of
> "equivalent" code the way we do for comprehensions.
> 
> x = [a, b, c] is equivalent to:
> x = list()
> x.append(a)
> x.append(b)
> x.append(c)
> 
> So now if we replace c with *c [where c == [d, e]], must we now say
> this?
> x = list()
> x.append(a)
> x.append(b)
> x.append(d, e)
> 
> Well, that's just not valid at all.

Right. And if we had a tradition of saying that list displays MUST be 
equivalent to the unpacked sequence of appends, then sequence unpacking 
inside a list display would be prohibited. But we have no such 
tradition, and sequence unpacking inside the list really is an obvious 
straight line extrapolation from (say) sequence unpacking inside a 
function call.

Fortunately, we have a *different* tradition when it comes to list 
displays, and no ruling that *c must turn into append with multiple 
arguments. Our explanation of [a, b, *c] occurs at an earlier level: 
replace the *c with the items of c:

    c = [d, e]
    [a, b, *c] ==> [a, b, d, e]

And there is no problem. 
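
A small sketch of that extrapolation, assuming Python 3.5 or later. 
collect() is a hypothetical helper written only for this comparison:

```python
def collect(*args):
    # Hypothetical helper: return the positional arguments as a list.
    return list(args)

a, b = "a", "b"
c = ["d", "e"]

from_call = collect(a, b, *c)   # unpacking inside a function call
from_display = [a, b, *c]       # unpacking inside a list display
```

In both cases *c is replaced by the items of c, so the two results 
hold the same four items.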

Unfortunately for you, none of this is the case for list comps. We DO 
have a tradition and a BDFL ruling that list comps are strictly 
equivalent to a loop with append.

And the transformation of *t into the items of t (I don't care if it is a 
real transformation in the implementation, or only a fictional 
transformation) cannot work in a list comp. Let's make the number of 
items of t explicit so we don't have to worry about variable item 
counts:

    [*t for t in iterable]  # t has three items
    [a, b, c for (a, b, c) in iterable]

That's a syntax error. To avoid the syntax error, we need parentheses:

    [(a, b, c) for (a, b, c) in iterable]

and that's a no-op. So we're back to my first response to this thread: 
why on earth would you expect *t in a list comprehension to flatten the 
iterable? It should be either an error, or a no-op.
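
Both claims can be checked with a short sketch, assuming sample data 
made of three-item tuples. compile() is used so that the invalid line 
is reported rather than stopping the script:

```python
iterable = [(1, 2, 3), (4, 5, 6)]

# The literal substitution of *t by its three items does not parse.
source = "[a, b, c for (a, b, c) in iterable]"
try:
    compile(source, "<example>", "eval")
    is_syntax_error = False
except SyntaxError:
    is_syntax_error = True

# The parenthesised form parses, but only rebuilds the same tuples.
rebuilt = [(a, b, c) for (a, b, c) in iterable]
```

Here rebuilt compares equal to the original list, which is the "no-op" 
referred to above.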

> Clearly we must reject this
> ridiculous notion of allowing starred expressions within list displays,
> because we _can't possibly_ change the transformation to accommodate the
> new feature.

Of course we can. I've repeatedly said we can do anything we want. If we 
want, we can have *t in a list comprehension be sugar for importing the 
sys module, or erasing your home directory. What we can't say is that 
"erasing your home directory" is an obvious straight-line extrapolation 
from existing uses of the star operator. There's nothing obvious here: 
this thread is proof that whatever connection (if any) exists between 
the two is non-obvious, twisted, even strange and bizarre.

I have never argued against this suggested functionality: flattening 
iterables is obviously a useful thing to do. But:

- we can already use a list comp to flatten: 

    [x for t in iterable for x in t]

- there's no obvious or clear connection between the *t in the suggested 
  syntax and existing uses of the star operator; it might as well be 
  spelled [magic!!!! t for t in iterable] for all the relevance sequence 
  unpacking has;

- if anyone can explain the connection they see, I'm listening;

  (believe me, I am *trying to understand* -- but none of the given 
  explanations for a connection hold up as far as I am concerned)

- even if we agree that there is a connection, this thread is 
  categorical proof that it is not obvious: it has taken DOZENS of 
  emails to (allegedly) get the message across;

- if we do get syntactic sugar for flatten(), why does it have to 
  overload the star operator for yet another meaning?

Hence my earlier questions: do we really need this, and if so, does it 
have to be spelled *t? Neither of those questions is obviously answered 
with a "Yes".

-- 
Steve