[Python-Dev] Extending tuple unpacking

Tue Oct 11 16:51:30 CEST 2005

Greg Ewing wrote:
> Guido van Rossum wrote:
> 
> 
>>BTW, what should
>>
>>    [a, b, *rest] = (1, 2, 3, 4, 5)
>>
>>do? Should it set rest to (3, 4, 5) or to [3, 4, 5]?
> 
> 
> Whatever type is chosen, it should be the same type, always.
> The rhs could be any iterable, not just a tuple or a list.
> Making a special case of preserving one or two types doesn't
> seem worth it to me.

And, for consistency with functions, the type chosen should be a tuple.

I'm also trying to figure out why you would ever write:
   [a, b, c, d] = seq

instead of:
   a, b, c, d = seq

or:
   (a, b, c, d) = seq

It's not like the square brackets generate different code:

Py> import dis
Py> def foo():
...     x, y = 1, 2
...     (x, y) = 1, 2
...     [x, y] = 1, 2
...
Py> dis.dis(foo)
   2           0 LOAD_CONST               3 ((1, 2))
               3 UNPACK_SEQUENCE          2
               6 STORE_FAST               1 (x)
               9 STORE_FAST               0 (y)

   3          12 LOAD_CONST               4 ((1, 2))
              15 UNPACK_SEQUENCE          2
              18 STORE_FAST               1 (x)
              21 STORE_FAST               0 (y)

   4          24 LOAD_CONST               5 ((1, 2))
              27 UNPACK_SEQUENCE          2
              30 STORE_FAST               1 (x)
              33 STORE_FAST               0 (y)
              36 LOAD_CONST               0 (None)
              39 RETURN_VALUE

So my vote would actually go for deprecating the use of square brackets to 
surround an assignment target list - it makes it look like an actual list 
object should be involved somewhere, but there isn't one.
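
A quick way to check this on whatever interpreter is at hand is to compile each
spelling separately and compare the resulting code objects. This is only a
sketch: the function names plain, parens and brackets are made up for the
comparison, and the exact opcodes vary between CPython versions, but the three
should match each other.

```python
def plain():
    x, y = 1, 2
    return x, y

def parens():
    (x, y) = 1, 2
    return x, y

def brackets():
    [x, y] = 1, 2
    return x, y

# Compare the raw bytecode and constants of the three functions.
codes = [f.__code__ for f in (plain, parens, brackets)]
same_bytecode = all(c.co_code == codes[0].co_code for c in codes)
same_consts = all(c.co_consts == codes[0].co_consts for c in codes)
print(same_bytecode, same_consts)
```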

>>? And then perhaps
>>
>>    *rest = x
>>
>>should mean
>>
>>    rest = tuple(x)
>>
>>Or should that be disallowed
> 
> Why bother? What harm would result from the ability to write that?

Given that:
   def foo(*args):
       print args

is legal, I would have no problem with "*rest = x" being legal.

>>There certainly is a need for doing the same from the end:
>>
>>    *rest, a, b = (1, 2, 3, 4, 5)
> 
> 
> I wouldn't mind at all if *rest were only allowed at the end.
> There's a pragmatic reason for that if nothing else: the rhs
> can be any iterable, and there's no easy way of getting "all
> but the last n" items from a general iterable.

Agreed. The goal here is to make the name binding rules consistent between for 
loops, tuple assignment and function entry, not to create different rules.
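
To make the difficulty with "all but the last n" concrete: binding trailing
names from an arbitrary iterator means holding back a window of n items until
the iterator is exhausted. A minimal sketch, where unpack_tail is a
hypothetical helper name and not a proposed API:

```python
from collections import deque
from itertools import islice

def unpack_tail(iterable, n):
    """Split an iterable into (rest, tail), where tail is the last n items."""
    it = iter(iterable)
    window = deque(islice(it, n))   # hold back the first n items
    if len(window) < n:
        raise ValueError("need at least %d values to unpack" % n)
    rest = []
    for item in it:
        window.append(item)            # a newer item arrived, so the oldest
        rest.append(window.popleft())  # held-back item belongs to rest
    return tuple(rest), tuple(window)

rest, (a, b) = unpack_tail(iter([1, 2, 3, 4, 5]), 2)
print(rest, a, b)
```

Nothing can be released as the tail until the source runs dry, which is why
this is awkward for a general iterable and trivial for a sequence.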

>>Where does it stop?
> For me, it stops with *rest only allowed at the end, and
> always yielding a predictable type (which could be either tuple
> or list, I don't care).

For me, it stops when the rules for positional name binding are more 
consistent across operations that bind names (although complete consistency 
isn't possible, given that function calls don't unpack sequences automatically).

Firstly, let's list the operations that permit name binding to a list of 
identifiers:
   - binding of function parameters to function arguments
   - binding of assignment target list to assigned sequence
   - binding of iteration variables to iteration values

However, the function argument case needs to be recognised as a two-step 
operation, whereby the arguments are *always* packed into a tuple before being 
bound to the parameters.

That is something very vaguely like:
   if numargs > 0:
     if numargs == 1:
       argtuple = args, # One argument gives singleton tuple
     else:
       argtuple = args # More arguments gives appropriate tuple
     argtuple += tuple(starargs) # Extended arguments are added to the tuple
     param1, param2, *rest = argtuple # Tuple is unpacked to parameters

This means that the current behaviour of function parameters is actually the 
same as assignment target lists and iteration variables, in that the argument 
tuple is *always* unpacked into the parameter list - the only difference is 
that a single argument is always considered a singleton tuple. You can get the 
same behaviour with target lists and iteration variables by only using tuples 
of identifiers as targets (i.e., use "x," rather than just "x").
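
A small illustration of that last point, using only existing syntax (the data
is made up for the example):

```python
pairs = [(1,), (2,), (3,)]

# A bare name binds the whole item, so each element stays a 1-tuple.
whole = [item for item in pairs]

# A one-element target list unpacks the item, like a single-argument call.
unpacked = []
for x, in pairs:
    unpacked.append(x)

x, = [42]   # target list: the sequence is unpacked, x is the element
y = [42]    # bare name: no unpacking, y is the list itself
print(whole, unpacked, x, y)
```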

So the proposal at this stage is simply to mimic the unpacking of the argument 
tuple into the formal parameter list in the other two name list binding cases, 
such that the pseudocode above would actually do the same thing as building an 
argument list and binding it to its formal parameters does.
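
A rough, runnable rendering of that model, assuming an interpreter that accepts
a starred name in an assignment target (Python 3 does, and it binds a list
there rather than a tuple, hence the conversion below). The names call_binding
and unpack_binding are hypothetical.

```python
def call_binding(param1, param2, *rest):
    # Real function entry: surplus positional arguments land in a tuple.
    return param1, param2, rest

def unpack_binding(*argtuple):
    # The same binding spelled as an assignment from the argument tuple.
    param1, param2, *rest = argtuple
    return param1, param2, tuple(rest)

print(call_binding(1, 2, 3, 4, 5))
print(unpack_binding(1, 2, 3, 4, 5))
```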

Now, when it comes to tuple *packing* syntax (i.e., extended call syntax), the 
appropriate behaviour would be for:

   1, 2, 3, *range(10)

to translate (roughly) to:

   (1, 2, 3) + tuple(range(10))

However, given that the equivalent code works just fine anywhere it really 
matters (assignment value, return value, yield value), and is clearer about 
what is going on, this option is probably worth avoiding.
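
For reference, the explicit spelling in each of those three positions (the
function names are made up for the example, and range(3) stands in for any
iterable):

```python
def as_assignment():
    value = (1, 2, 3) + tuple(range(3))   # assignment value
    return value

def as_return():
    return (1, 2, 3) + tuple(range(3))    # return value

def as_yield():
    yield (1, 2, 3) + tuple(range(3))     # yield value

print(as_assignment(), as_return(), next(as_yield()))
```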

>>BTW, and quite unrelated, I've always felt uncomfortable that you have to write
>>
>>    f(a, b, foo=1, bar=2, *args, **kwds)
>>
>>I've always wanted to write that as
>>
>>    f(a, b, *args, foo=1, bar=2, **kwds)
> 
> 
> Yes, I'd like that too, with the additional meaning that
> foo and bar can only be specified by keyword, not by
> position.

Indeed. It's a (minor) pain that optional flag variables and variable length 
argument lists are currently mutually exclusive. Although, if you had that 
rule, I'd want to be able to write:

   def f(a, b, *, foo=1, bar=2): pass

to get a function which required exactly two positional arguments, but had a 
couple of optional keyword arguments, rather than having to do:

   def f(a, b, *args, foo=1, bar=2):
     if args:
       raise TypeError("f() takes exactly 2 positional arguments (%d given)"
                       % (2 + len(args)))
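
The two spellings can be compared side by side on an interpreter that accepts
keyword arguments after the star and a bare "*" in a definition (Python 3
does). This is a sketch; f_star and f_manual are hypothetical names for the
two versions of f above.

```python
def f_star(a, b, *, foo=1, bar=2):
    # The bare * ends the positional parameters; foo and bar are keyword only.
    return a, b, foo, bar

def f_manual(a, b, *args, foo=1, bar=2):
    # Same interface, with the surplus positional arguments rejected by hand.
    if args:
        raise TypeError("f_manual() takes exactly 2 positional arguments "
                        "(%d given)" % (2 + len(args)))
    return a, b, foo, bar

print(f_star(1, 2, foo=5), f_manual(1, 2, bar=7))
```

Both reject f(1, 2, 3) with a TypeError; the bare star just gets the
interpreter to do the check.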

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com