sum for sequences?
Steve Howell
showell30 at yahoo.com
Mon Mar 29 23:34:17 EDT 2010
On Mar 29, 8:01 pm, Steven D'Aprano
<ste... at REMOVE.THIS.cybersource.com.au> wrote:
> You don't define symmetry. You don't even give a sensible example of
> symmetry. Consequently I reject your argument that because sum is the
> obvious way to sum a lot of integers, "symmetry" implies that it should
> be the obvious way to concatenate a lot of lists.
>
You are not rejecting my argument; you are rejecting an improper
paraphrase of my argument.
My argument was that repeated use of "+" is spelled "sum" for
integers, so it's natural to expect the same name for repeated use of
"+" on lists. Python already allows for this symmetry, just SLOWLY.
>
> You are correct that building intermediate lists isn't *compulsory*,
> there are alternatives, but the alternatives themselves have costs.
> Complexity itself is a cost. sum currently has nice simple semantics,
> which means you can reason about it: sum(sequence, start) is the same as
>
> total = start
> for item in sequence:
>     total = total + item
> return total
>
I could just as reasonably expect these semantics:
total = start
for item in sequence:
    total += item
return total
Python does not contradict my expectations here:
>>> start = []
>>> x = sum([], start)
>>> x.append(1)
>>> start
[1]
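The in-place semantics can be sketched as a plain function. This is only an illustration under stated assumptions: the name inplace_sum is hypothetical and is not part of Python.

```python
def inplace_sum(sequence, start=0):
    """Hypothetical helper: fold `sequence` into `start` using +=."""
    total = start
    for item in sequence:
        # For lists this extends `total` in place; for ints it rebinds
        # `total` to a new object, exactly as `total = total + item` would.
        total += item
    return total


numbers = inplace_sum([1, 2, 3])

start = []
flat = inplace_sum([[1, 2], [3]], start)
```

Note that with a list, the caller's start object is the one that gets extended, so a fresh list should be passed if the original must stay untouched.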
> You don't have to care what the items in sequence are, you don't have to
> make assumptions about what methods sequence and start have (beyond
> supporting iteration and addition).
The only additional assumption I'm making is that Python can take
advantage of in-place addition, which is easy to introspect.
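One way the introspection might look, as a rough sketch: check whether the type defines __iadd__. The helper name supports_inplace_add is hypothetical.

```python
def supports_inplace_add(obj):
    """Hypothetical helper: True if type(obj) defines __iadd__."""
    # Look the method up on the type, since that is where Python
    # itself looks when it evaluates `obj += other`.
    return hasattr(type(obj), "__iadd__")


list_ok = supports_inplace_add([])   # lists define __iadd__
int_ok = supports_inplace_add(0)     # ints do not; += falls back to +
tuple_ok = supports_inplace_add(())  # tuples do not either
```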
> Adding special cases to sum means it
> becomes more complex and harder to reason about. If you pass some other
> sequence type in the middle of a bunch of lists, what will happen? Will
> sum suddenly break, or perhaps continue to work but inefficiently?
This is mostly a red herring, as I would tend to use sum() on
sequences of homogeneous types.
Python already gives me the power to shoot myself in the foot for
strings.
>>> lst = [1, 2]
>>> lst += "foo"
>>> lst
[1, 2, 'f', 'o', 'o']
>>> lst = [1,2]
>>> lst.extend('foo')
>>> lst
[1, 2, 'f', 'o', 'o']
I'd prefer that sum() raise an exception in the same cases where +=
would raise one.
>>> start = []
>>> bogus_example = [[1, 2], None, [3]]
>>> for item in bogus_example: start += item
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'NoneType' object is not iterable
> You still need to ask these questions with existing sum, but it is
> comparatively easy to answer them: you only need to consider how the
> alternative behaves when added to a list. You don't have to think about
> the technicalities of the sum algorithm itself -- sometimes it calls +,
> sometimes extend, sometimes +=, sometimes something else
I would expect sum() to support the same contract as +=, which already
works for numerics (so no backward incompatibility), and which already
works for lists. For custom-designed classes, I would rely on the
promise that augmented assignment falls back to normal methods.
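The fallback can be seen with a small class that defines only __add__. This is a minimal sketch; the Money class is a hypothetical example, not something from this thread.

```python
class Money:
    """Hypothetical immutable value type that defines only __add__."""

    def __init__(self, cents):
        self.cents = cents

    def __add__(self, other):
        return Money(self.cents + other.cents)


total = Money(0)
original = total
for item in [Money(150), Money(250)]:
    # Money has no __iadd__, so this runs as `total = total + item`.
    total += item
```

Because the fallback builds a new object each time, the original start value is left unchanged, which matches what sum() does today for such classes.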
> ... which of the
> various different optimized branches will I fall into this time? Who
> knows? sum already has two branches. In my opinion, three branches is one
> too many.
As long as it falls into the branch that works, I'm happy. :)
>
> "Aggregating" lists? Not summing them? I think you've just undercut your
> argument that sum is the "obvious" way of concatenating lists.
>
> In natural language, we don't talk about "summing" lists, we talk about
> joining, concatenating or aggregating them. You have just done it
> yourself, and made my point for me.
Nor do you use "chain" or "extend."
> And this very thread started because
> somebody wanted to know what the equivalent to sum for sequences is.
>
> If sum was the obvious way to concatenate sequences, this thread wouldn't
> even exist.
This thread is entitled "sum for sequences." I think you just made my
point.