sum for sequences?
steve at REMOVE-THIS-cybersource.com.au
Thu Mar 25 22:11:50 CET 2010
On Thu, 25 Mar 2010 14:02:05 +0100, Alf P. Steinbach wrote:
> * Neil Cerutti:
>> On 2010-03-25, Steven D'Aprano <steven at REMOVE.THIS.cybersource.com.au> wrote:
>>>> You might not want to be so glib. The sum doc sure doesn't sound
>>>> like it should work on lists.
>>>> Returns the sum of a sequence of numbers (NOT strings) plus the
>>>> value of parameter 'start' (which defaults to 0).
>>> What part of that suggested to you that sum might not be polymorphic?
>>> Sure, it says numbers (which should be changed, in my opinion), but it
>>> doesn't specify what sort of numbers -- ints, floats, or custom types
>>> that have an __add__ method.
> I think Steven's argument is that it would be pointless for 'sum' to
> distinguish between user-defined numerical types and other types that
> happen to support '+'.
Before Python 2.6, which introduced a numeric tower, Python *couldn't*
reliably distinguish between numeric types and other types that
overloaded +. Since Python discourages type-checking in favour of
duck-typing and try...except, this is seen as a good thing.
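A minimal sketch of what the numeric tower makes possible, assuming Python 2.6 or later. The helper name is_numeric is made up for illustration; only numbers.Number and fractions.Fraction come from the standard library:

```python
import numbers
from fractions import Fraction

def is_numeric(obj):
    # Hypothetical helper: True for anything registered in the numeric tower.
    return isinstance(obj, numbers.Number)

print(is_numeric(3))               # True
print(is_numeric(2.5))             # True
print(is_numeric(Fraction(1, 3)))  # True
print(is_numeric([1, 2]))          # False: lists overload + but are not numbers
print(is_numeric('ab'))            # False
```

sum itself performs no such check, which is the point being made here: it relies on + working, not on the operands being numbers.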
My argument is that sum isn't hard-coded to only work on the built-in
ints or floats, but it supports any object that you can use the +
operator on. The *sole* exceptions are str and unicode (not even
UserString), and even there it is very simple to overcome the restriction:
>>> sum(['a', 'b'], '')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: sum() can't sum strings [use ''.join(seq) instead]
>>> class S:
...     def __add__(self, other):
...         return other
...
>>> sum(['a', 'b'], S())
'ab'
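To illustrate the polymorphism point, here is a small sketch. The Vec class is made up for this example and is not from any library. sum only needs the + operator and, for non-numeric operands, a suitable start value:

```python
from fractions import Fraction

class Vec(object):
    """Hypothetical 2-D vector, used only for illustration."""
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __add__(self, other):
        return Vec(self.x + other.x, self.y + other.y)

# A custom type: the start value must itself support + with the items.
total = sum([Vec(1, 2), Vec(3, 4)], Vec(0, 0))
print((total.x, total.y))      # (4, 6)

# A standard-library numeric type works with the default start of 0.
frac_total = sum([Fraction(1, 2), Fraction(1, 3)])
print(frac_total)              # 5/6

# Lists are accepted too, given an empty list as the start value.
list_total = sum([[1], [2, 3]], [])
print(list_total)              # [1, 2, 3]
```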
> However, given that it isn't restricted to numbers, the restriction wrt.
> strings is a bit perplexing in the context of modern CPython. But for
> Python implementations that don't offer the '+=' optimization it might
> help to avoid gross inefficiencies, namely quadratic time string
I agree -- the Python philosophy is to allow the user to shoot themselves
in the foot if they wish to. You're responsible for the Big Oh behaviour
of your code, not the compiler.
> However, if that hypothesis about the rationale is correct, then 'sum'
> should also be restricted to not handle tuples or lists, so forth, but
> at least the CPython implementation does.
The reasoning is that naive users are far, far more likely to try summing
a large list of strings than to try summing a large list of lists, and
therefore in practical terms the consequences of allowing sum on lists are
slight enough and rare enough not to be worth the check.
I suspect that this is just an after-the-fact rationalisation, and that
the real reason is that those responsible for the hand-holding in sum
merely forgot, or didn't know, that repeated addition of lists and tuples
is also O(N**2). But I've never cared enough to dig through the archives
to find out.
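A rough sketch of why repeated list addition is O(N**2), assuming each + builds a brand-new list by copying both operands. The helper copies_for_sum is hypothetical and only counts those copies; itertools.chain.from_iterable is shown as a linear-time alternative that gives the same result:

```python
from itertools import chain

def copies_for_sum(lists):
    # Hypothetical helper: count the elements copied if every addition
    # creates a new list holding the accumulated result plus the next item.
    copied = 0
    acc_len = 0
    for item in lists:
        acc_len += len(item)
        copied += acc_len
    return copied

lists = [[i] for i in range(1000)]

flat_sum = sum(lists, [])                      # repeated addition
flat_chain = list(chain.from_iterable(lists))  # single linear pass

print(flat_sum == flat_chain)   # True
print(copies_for_sum(lists))    # 500500 element copies for 1000 one-item lists
```

For N one-item lists the count is N*(N+1)/2, so doubling N roughly quadruples the copying, whereas the chain version touches each element once.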