Feature suggestion: sum() ought to use a compensated summation algorithm
Ivan Illarionov
ivan.illarionov at gmail.com
Sat May 3 14:04:50 EDT 2008
On Sat, 03 May 2008 18:50:34 +0200, Szabolcs Horvát wrote:
> I did the following calculation: Generated a list of a million random
> numbers between 0 and 1, constructed a new list by subtracting the mean
> value from each number, and then calculated the mean again.
>
> The result should be 0, but of course it will differ from 0 slightly
> because of rounding errors.
>
> However, I noticed that the simple Python program below gives a result
> of ~ 10^-14, while an equivalent Mathematica program (also using double
> precision) gives a result of ~ 10^-17, i.e. three orders of magnitude
> more precise.
>
> Here's the program (pardon my style, I'm a newbie/occasional user):
>
> from random import random
>
> data = [random() for x in xrange(1000000)]
>
> mean = sum(data)/len(data)
> print sum(x - mean for x in data)/len(data)
>
> A little research shows that Mathematica uses a "compensated summation"
> algorithm. Indeed, using the algorithm described at
> http://en.wikipedia.org/wiki/Kahan_summation_algorithm gives us a result
> around ~ 10^-17:
>
> def compSum(arr):
> s = 0.0
> c = 0.0
> for x in arr:
> y = x-c
> t = s+y
> c = (t-s) - y
> s = t
> return s
>
> mean = compSum(data)/len(data)
> print compSum(x - mean for x in data)/len(data)
>
>
> I thought that it would be very nice if the built-in sum() function used
> this algorithm by default. Has this been brought up before? Would this
> have any disadvantages (apart from a slight performance impact, but
> Python is a high-level language anyway ...)?
>
> Szabolcs Horvát
Built-in sum should work with everything, not just floats. I think it
would be useful addition to standard math module.
If you know C you could write a patch to mathmodule.c and submit it to
Python devs.
--
Ivan
More information about the Python-list
mailing list