sum for sequences?
Alf P. Steinbach
alfps at start.no
Sun Mar 28 12:56:26 EDT 2010
* Steve Howell:
> On Mar 28, 8:17 am, Duncan Booth <duncan.bo... at invalid.invalid> wrote:
>> Steve Howell <showel... at yahoo.com> wrote:
>>> The mildly surprising part of sum() is that it does add vs. add-in-
>>> place, which leads to O(N) vs. O(1) for the inner loop calls, for
>>> certain data structures, notably lists, even though none of the
>>> intermediate results get used by the caller. For lists, you could
>>> make a more efficient variant of sum() that clones the start value and
>>> does add-in-place.
>> Doing add-in-place isn't the only way to make sum more efficient: if you
>> assume that addition is associative (which of course the builtin sum can't)
>> then you can form partial sums. e.g. instead of calculating:
>>
>> (((((((a + b) + c) + d) + e) + f) + g) + h)
>>
>> you calculate:
>>
>> (((a + b) + (c + d)) + ((e + f) + (g + h)))
>>
>> Obviously this requires more space than the naive sum, but not as much as
>> you might at first expect: you only need to hold log(n) intermediate
>> values at any time.
>>
>
> Yep, I did mention in my post that the outer loop does not *have* to
> be O(N), if you can parallelize it. Apart from reducing
> intermediates, the divide-and-conquer method does not reduce overall
> computation time unless you have multiple processors, correct?
>
Even with a single thread of execution it can reduce the time for large integers and
custom numerical types from O(n^2) to O(n log n), where n is the number of bits in the
final result. I don't think the parallelism here is very relevant for Python.
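For illustration, a rough sketch of such a pairwise summation could look like this
(untested; 'pairwise_sum' is just a name I'm making up, and it assumes '+' is
associative for the element type, which the builtin sum can't assume):

    def pairwise_sum(seq, start=0):
        # Sum by splitting the sequence in half and adding operands of
        # roughly equal size; only O(log n) intermediates are live at once.
        items = list(seq)
        if not items:
            return start
        def rec(lo, hi):
            if hi - lo == 1:
                return items[lo]
            mid = (lo + hi) // 2
            return rec(lo, mid) + rec(mid, hi)
        return start + rec(0, len(items))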
From a more practical point of view, sum's efficiency could be improved by doing the
first addition using '+' and the rest using '+=': the initial '+' produces a fresh
object, so the in-place additions never touch the caller's start value, and the
observable behavior stays the same.
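A rough sketch of that variant (untested; the name 'sum_inplace' is mine, not anything
in the standard library):

    def sum_inplace(iterable, start=0):
        it = iter(iterable)
        try:
            # The first '+' copies the start value, ...
            result = start + next(it)
        except StopIteration:
            return start
        for item in it:
            # ... so '+=' here mutates only that private copy.
            result += item
        return result

For lists this replaces the quadratic copying of repeated '+' with linear in-place
extending, at least for well-behaved types where '+=' agrees with '+'.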
Cheers & hth.,
- Alf