# [Python-Dev] sum()

Alex Martelli aleaxit at yahoo.com
Fri Mar 11 20:08:09 CET 2005

```On Mar 11, 2005, at 19:39, Raymond Hettinger wrote:

> [Alex]
>> If you're considering revisions to sum's design, my suggestion would
> be
>> to find a way to let the user tell sum to use a more accurate approach
>> when summing floats.
>
> FWIW, when accuracy is an issue, I use:
>
>    sum(sorted(data, key=abs))

...and you may still lose a LOT of accuracy that way:-(.

Raymond, you technically reviewed the 2nd ed Cookbook -- don't you
recall the sidebar about why partials are the RIGHT way to do
summations?  I was SO proud of that one, thought I'd made the issue
clearest than anywhere previously in print...

To summarize: as we all know, a+b loses significant digits from (say) b
when abs(a) is much larger than abs(b).  Now, say we're summing a
million numbers, all positive and roughly the same magnitude.  If we do
it by total += item (which is basically what sum does), by the time
we've summed the first 100,000 or so total IS *much larger than item*
in absolute value -- we're systematically tossing away about 5 digits
from each new item by that time!!!

To avoid such a massacre of digits, one uses partials -- summing items
two at a time to get half as many partials, then iterating that idea
about log2(N) times when summing N items.  Problem is, one needs O(N)
auxiliary space (assuming one can't just overwrite the incoming
list/array/whatever).

There's all other kinds of accuracy issues with a+b, but the one
partials-based summation deals with is numero uno in many real-life
situations, e.g. when one needs to get (say) the total area from N
regions, each of roughly the same magnitude, whose areas were
separately measured -- total length from N segments ditto -- total mass
of N items ditto -- and so forth.

Alex

```

More information about the Python-Dev mailing list