Feature suggestion: sum() ought to use a compensated summation algorithm
Duncan Booth
duncan.booth at invalid.invalid
Sun May 4 11:58:25 EDT 2008
Szabolcs Horvát <szhorvat at gmail.com> wrote:
> I thought that it would be very nice if the built-in sum() function used
> this algorithm by default. Has this been brought up before? Would this
> have any disadvantages (apart from a slight performance impact, but
> Python is a high-level language anyway ...)?
There's a thread you might find interesting:
http://groups.google.com/group/comp.lang.python/browse_thread/thread/9eda29faf92f532e/027cef7d4429aa3a
In that thread I suggested that Python ought to implement sum by adding
together each pair of values, then each pair of results and so on. This
means that for input values of a similar magnitude the intermediate results
stay balanced. The implementation doesn't actually do all the first level
sums first, it builds up a stack containing the sum of 2^k, 2^(k-1)...2
values. Also it works for other types as well, and for strings or lists has
the advantage of minimising the number of large string or list concatenations.
If you follow that thread far enough you'll find a post by Jussi Salmela
which shows that summing adjacent pairs performs as well as Kahan summation
(he says 'almost as good' but in fact the only time they get different
results the 'correct' answer is 500000.000005 and Kahan and sumpairs get
the two nearest representable results either side: 500000.00000499998 and
500000.00000500004 respectively).
I never did get round to re-coding my sumpairs() function in C, I would be
interested to see just how much overhead it introduces (and I suspect it
would be faster than the existing sum in some cases such as for lists).
More information about the Python-list
mailing list