[Numpy-discussion] Reducing the rounding error of np.sum
Christoph Deil
deil.christoph at googlemail.com
Thu Jul 4 14:36:07 EDT 2013
On Jul 4, 2013, at 8:15 PM, Julian Taylor <jtaylor.debian at googlemail.com> wrote:
> hi,
> numpys implementation of sum is just a simple:
> for i in d:
> sum += d[i]
>
> this suffers from rather high rounding errors in the order of the d.size
> * epsilon. Consider:
> (np.ones(50000) / 10.).sum()
> 5000.0000000006585
>
> There are numerous algorithms which reduce the error of this operation.
> E.g. python implements one in math.fsum which is accurate but slow
> compared to np.sum [0].
>
> Numpy is currently lacking a precise summation, but I think it would
> make sense to add one.
> The question is whether we go with the python approach of adding a new
> function which is slower but more precise or if we even change the
> default summation algorithm (or do nothing :) ).
>
> For a new function I guess the method used in python itself makes sense,
> its probably well chosen by the python developers (though I did not
> lookup the rational for the choice yet).
>
> For replacing the default, two algorithms come to my mind, pairwise
> summation [1] and kahan summation (compensated sum) [2].
> pairwise summation adds in pairs so usually the magnitude of the two
> operands is the same magnitude, this produces an error of O(log n *
> epsilon) for the common case.
> This algorithm has the advantage that it is almost as fast as the naive
> sum with an reasonable error.
> Problematic might be the buffering numpy does when reducing, this would
> limit the error reduction to the buffer size.
>
> kahan summation adds some extra operations to recover the rounding
> errors. This results in an error of o(epsilon).
> It is four times slower than the naive summation but this can be
> (partially) compensated by vectorizing it.
> It has the advantage of lower error, simpler implementation and
> buffering does not interfere.
>
> I did prototype implementations (only unit strides) of both in [3, 4].
>
> Any thoughts on this?
In case you are not aware, there has been some discussion on how numerically stable sum could be added to numpy here:
https://github.com/numpy/numpy/issues/2448
>
>
> [0]
> http://docs.python.org/2/library/math.html#number-theoretic-and-representation-functions
> [1] http://en.wikipedia.org/wiki/Pairwise_summationg
> [2] http://en.wikipedia.org/wiki/Kahan_summation_algorithm
> [3] https://github.com/juliantaylor/numpy/tree/pairwise
> [4] https://github.com/juliantaylor/numpy/tree/kahan
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
More information about the NumPy-Discussion
mailing list