[Python-Dev] sum()

Sat Mar 12 05:02:55 CET 2005

FYI, there are a lot of ways to do accurate fp summation, but in
general people worry about it too much (except for those who don't
worry about it at all -- they're _really_ in trouble <0.5 wink>).

One clever way is to build on that whenever |x| and |y| are within a
factor of 2 of each other, x+y is exact in 754 arithmetic.  So you
never lose information (not even one bit) when adding two floats with
the same binary exponent.  That leads directly to this kind of code:

from math import frexp

class Summer:
    def __init__(self):
        self.exp2sum = {}

    def add(self, x):
        while 1:
            exp = frexp(x)[1]
            if exp in self.exp2sum:
                x += self.exp2sum.pop(exp) # exact!
            else:
                break
        self.exp2sum[exp] = x # trivially exact

    def total(self):
        items = self.exp2sum.items()
        items.sort()
        return sum((x for dummy, x in items), 0.0)

exp2sum maps a binary exponent to a float having that exponent.  If
you pass a sequence of fp numbers to .add(), then ignoring
underflow/overflow endcases, the key invariant is that the exact
(mathematical) sum of all the numbers passed to add() so far is equal
to the exact (mathematical) sum of exp2sum.values().  While it's not
obvious at first, the total number of additions performed inside add()
is typically a bit _less_ than the total number of times add() is
called.

More importantly, len(exp2sum) can never be more than about 2K.  The
worst case for that is having one input with each possible binary
exponent, like 2.**-1000 + 2.**-999 + ... + 2.**999 + 2.**1000.  No
inputs are like that in real life, and exp2sum typically has no more
than a few dozen entries.

total() then adds those, in order of increasing exponent == in order
of increasing absolute value.  This can lose information, but there is
no bad case for it, in part because there are typically so few
addends, and in part because that no two addends have the same binary
exponent implies massive cancellation can never occur.

Playing with this can show why most fp apps shouldn't worry most
often.  Example:

Get a million floats of about the same magnitude:

>>> xs = [random.random() for dummy in xrange(1000000)]

Sum them accurately:

>>> s = Summer()
>>> for x in xs:
...     s.add(x)

No information has been lost yet (if you could look at your FPU's
"inexact result" flag, you'd see that it hasn't been set yet), and the
million inputs have been squashed into just a few dozen buckets:

>>> len(s.exp2sum)
22
>>> from pprint import pprint
>>> pprint(s.exp2sum)
{-20: 8.8332388070710977e-007,
 -19: 1.4206079529399673e-006,
 -16: 1.0065260162672729e-005,
 -15: 2.4398649189794064e-005,
 -14: 5.3980784313178987e-005,
 -10: 0.00074737138777436485,
 -9: 0.0014605198999595448,
 -8: 0.003361820812962546,
 -7: 0.0063811680318408559,
 -5: 0.016214300821313588,
 -4: 0.044836286041944229,
 -2: 0.17325355843673518,
 -1: 0.46194788522906305,
 3: 6.4590200674982423,
 4: 11.684394209886134,
 5: 24.715676913177944,
 6: 49.056084672323166,
 10: 767.69329043309051,
 11: 1531.1560084859361,
 13: 6155.484212371357,
 17: 98286.760386143636,
 19: 393290.34884990752}

Add those 22, smallest to largest:

>>> s.total()
500124.06621686369

Add the originals, "left to right":

>>> sum(xs)
500124.06621685845

So they're the same to about 14 significant decimal digits.  No good
if exact pennies are important, but far more precise than real-world
measurements.

How much does sorting first help?

>>> xs.sort()
>>> sum(xs)
500124.06621685764

Not much!  It actually moved a bit in the wrong direction (which is
unusual, but them's the breaks).

Using decimal as a (much more expensive) sanity check:

>>> import decimal
>>> t = decimal.Decimal(0)
>>> for x in xs:
...     t += decimal.Decimal(repr(x))
>>> print t
500124.0662168636972127329455

Of course <wink> Summer's result is very close to that.