Feature suggestion: sum() ought to use a compensated summation algorithm

Sat May 3 14:04:50 EDT 2008

On Sat, 03 May 2008 18:50:34 +0200, Szabolcs Horvát wrote:

> I did the following calculation:  Generated a list of a million random
> numbers between 0 and 1, constructed a new list by subtracting the mean
> value from each number, and then calculated the mean again.
> 
> The result should be 0, but of course it will differ from 0 slightly
> because of rounding errors.
> 
> However, I noticed that the simple Python program below gives a result
> of ~ 10^-14, while an equivalent Mathematica program (also using double
> precision) gives a result of ~ 10^-17, i.e. three orders of magnitude
> more precise.
> 
> Here's the program (pardon my style, I'm a newbie/occasional user):
> 
> from random import random
> 
> data = [random() for x in xrange(1000000)]
> 
> mean = sum(data)/len(data)
> print sum(x - mean for x in data)/len(data)
> 
> A little research shows that Mathematica uses a "compensated summation"
> algorithm.  Indeed, using the algorithm described at
> http://en.wikipedia.org/wiki/Kahan_summation_algorithm gives us a result
> around ~ 10^-17:
> 
> def compSum(arr):
>      s = 0.0
>      c = 0.0
>      for x in arr:
>          y = x-c
>          t = s+y
>          c = (t-s) - y
>          s = t
>      return s
> 
> mean = compSum(data)/len(data)
> print compSum(x - mean for x in data)/len(data)
> 
> 
> I thought that it would be very nice if the built-in sum() function used
> this algorithm by default.  Has this been brought up before?  Would this
> have any disadvantages (apart from a slight performance impact, but
> Python is a high-level language anyway ...)?
> 
> Szabolcs Horvát

Built-in sum should work with everything, not just floats. I think it 
would be useful addition to standard math module. 

If you know C you could write a patch to mathmodule.c and submit it to 
Python devs.

-- 
Ivan