Tuples vs. variable-length argument lists

Fri Mar 19 23:20:38 EDT 2010

On Fri, 19 Mar 2010 17:20:31 -0700, Spencer Pearson wrote:

> Hi!
> 
> This might be more of a personal-preference question than anything, but
> here goes: when is it appropriate for a function to take a list or tuple
> as input, and when should it allow a varying number of arguments? 

That depends on the function as well as your personal preference.

Of course, you can also follow the lead of max and min and accept both:

>>> max(1, 2, 3, 4) == max([1, 2, 3, 4])
True
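
As a rough sketch of how a function might accept both forms (this is not
how max is actually implemented, and `largest` is a made-up name used
only for illustration), you can treat a single argument as an iterable
and anything else as the values themselves:

```python
def largest(*args):
    """Hypothetical helper that mimics the calling convention of max()."""
    if len(args) == 1:
        # A single argument is assumed to be an iterable of values.
        items = list(args[0])
    else:
        # Several arguments are the values themselves.
        items = list(args)
    if not items:
        raise ValueError("largest() needs at least one value")
    result = items[0]
    for item in items[1:]:
        if item > result:
            result = item
    return result


print(largest(1, 2, 3, 4) == largest([1, 2, 3, 4]))  # True
print(largest("abc"))  # c
```

The price is an ambiguity: a lone argument is always taken to be the
sequence, so largest("abc") compares characters and largest(5) fails
because 5 is not iterable. The built-in max behaves the same way.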

You need to consider which is more "natural" for the semantics of the 
function.

> It seems as though the two are always interchangeable. 

Not always. Consider sum(a, b, c, d).

Should that be "sum the sequence [a, b, c, d]" or "sum the sequence [a, 
b, c] with initial value d"? You might ask what difference it makes, but 
it may make a big difference:

* if a...d are floats, there may be differences in rounding errors 
  between a+b+c+d and d+a+b+c

* if a...d are lists, the order that you do the addition matters:
  ["a", "b", "c"] + ["d"] != ["d"] + ["a", "b", "c"]

* even if a...d are integers, it may make a big difference for
  performance. Perhaps not for a mere four arguments, but watch:

>>> n = 10**1000000
>>> seq = [n] + list(range(10001))  # list() is needed on Python 3
>>> from timeit import Timer
>>> t1 = Timer("sum(seq)", "from __main__ import n, seq; seq.append(-n)")
>>> t2 = Timer("sum(seq, -n)", "from __main__ import n, seq")
>>> min(t1.repeat(number=1))
6.1270790100097656
>>> min(t2.repeat(number=1))
0.012988805770874023

In the first case, every intermediate total is a million-digit integer. 
In the second, the initial value -n cancels the leading n straight away, 
so the additions over range(10001) are done on small integers.
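
The three points above can be checked without a stopwatch. This is only
a sketch: `running_totals` is a hypothetical helper that does a plain
left-to-right addition (newer Python versions may sum floats more
accurately inside sum itself, so I avoid sum here), and n is much
smaller than in the timing run so that it finishes instantly.

```python
def running_totals(items, start=0):
    """Yield every intermediate total of a plain left-to-right addition."""
    total = start
    for item in items:
        total = total + item
        yield total


# Floats: the same four values, added in two different orders.
a, b, c, d = 1e16, 1.0, 1.0, -1e16
print(list(running_totals([a, b, c, d]))[-1])   # 0.0  (a+b+c+d)
print(list(running_totals([a, b, c], d))[-1])   # 2.0  (d+a+b+c)

# Lists: the start value ends up at the front of the result.
print(list(running_totals([["a"], ["b"], ["c"]], ["d"]))[-1])
# ['d', 'a', 'b', 'c']

# Integers: count the intermediate totals too big for a 64-bit word.
n = 10**1000
seq = [n] + list(range(101))
big_without_start = sum(t.bit_length() > 64
                        for t in running_totals(seq + [-n]))
big_with_start = sum(t.bit_length() > 64
                     for t in running_totals(seq, -n))
print(big_without_start, big_with_start)  # 102 0
```

Both integer runs end on the same total, 5050, but only one of them 
drags a huge number through every step.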

[...]
> I can't think
> of any situation where you couldn't convert from one form to the other
> with just a star or a pair of parentheses.

Of course you can convert, but the conversion is itself work: calling 
f(*seq) unpacks the sequence and packs its items into a new tuple. 
Perhaps that's unnecessary work for your specific function? Whether this 
is a good thing or a bad thing depends on the function.
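
To make the conversion visible, here is a small sketch. The names
`total_args` and `total_seq` are invented for this example; each returns
the type of what it received along with the sum.

```python
def total_args(*args):
    """Variable-length form: the values arrive packed into a tuple."""
    return type(args).__name__, sum(args)


def total_seq(seq):
    """Single-argument form: the caller's object is passed through as-is."""
    return type(seq).__name__, sum(seq)


values = [1, 2, 3]
print(total_args(*values))   # ('tuple', 6)
print(total_seq(values))     # ('list', 6)

# total_seq consumes a generator lazily, but the star forces the whole
# thing to be built as a tuple before total_args even starts running.
print(total_seq(x * x for x in range(4)))      # ('generator', 14)
print(total_args(*(x * x for x in range(4))))  # ('tuple', 14)
```

For three items nobody will notice the copy. For a very long or lazily 
produced sequence, it may matter.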

> Is there a generally accepted convention for which method to use? Is
> there ever actually a big difference between the two that I'm not
> seeing?

If you think of the function as operating on a single argument which is a 
sequence of arbitrary length, then it is best written to take a single 
sequence argument.

If you think of the function as operating on an arbitrary number of 
arguments, then it is best written to take an arbitrary number of 
arguments.
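
Two standard-library functions that do similar jobs happen to show the
two models side by side. The choice of examples is mine, and I use
posixpath rather than os.path only so the output is the same on every
platform.

```python
import posixpath

# str.join operates on one sequence of arbitrary length.
parts = ["usr", "local", "bin"]
print("/".join(parts))                        # usr/local/bin

# posixpath.join operates on an arbitrary number of path components.
print(posixpath.join("usr", "local", "bin"))  # usr/local/bin

# Converting from one form to the other takes just a star.
print(posixpath.join(*parts))                 # usr/local/bin
```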

If you think it doesn't matter, then choose whichever model is less work 
for you.

If neither seems better than the other, then choose arbitrarily.

If you don't like the idea of making an arbitrary choice, or if your 
users complain, then support both models (if possible).

-- 
Steven