[Python-ideas] statistics.sum [was Re: Pre-PEP: adding a statistics module to Python]

Wed Aug 7 05:14:58 CEST 2013

David Mertz writes:

 > However, one name that hasn't been mentioned might be even
 > better: statistics._sum().

-1

statistics.sum() is needed any time you want to take the sum of a
function of the difference of series with similar means, because
you're likely to get a large number of small differences, and a few
large differences, with the large differences pretty much offsetting
each other.  The canonical example is the series of deviations of a
series from its mean (the average of the squares of those deviations
is the variance, which is why statistics.sum is needed internally in
Steven's package), but such constructions occur frequently in
statistical analysis.  One example is in linear regression with nearly
collinear regressors.  Another is in "standardizing" variates to have
mean zero and variance one.  (Perhaps that is -- or should be --
included in the statistics package, but it seems to violate the "not
every 3-line function" rule of thumb.)
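
To make the cancellation problem concrete, here is a minimal sketch.
It uses math.fsum as a stand-in for the proposed statistics.sum (an
assumption; the proposed function need not be implemented that way),
the data are made up, and standardize is a hypothetical helper, not
something in Steven's package.

```python
import math

# Made-up data: two huge terms that offset each other, plus ten small ones.
diffs = [1e16] + [1.0] * 10 + [-1e16]

# Plain left-to-right floating-point addition: each 1.0 is rounded
# away when it is added to 1e16, so the small terms vanish.
naive = 0.0
for d in diffs:
    naive += d

# math.fsum tracks the lost low-order bits and returns the
# correctly rounded sum.
exact = math.fsum(diffs)

print(naive)  # 0.0
print(exact)  # 10.0


def standardize(data):
    """Hypothetical helper: rescale data to mean zero and variance one."""
    n = len(data)
    mean = math.fsum(data) / n
    devs = [x - mean for x in data]
    # Population variance: the average of the squared deviations.
    sd = math.sqrt(math.fsum(d * d for d in devs) / n)
    return [d / sd for d in devs]


z = standardize([1.0, 2.0, 3.0])
```

The naive loop loses every one of the small terms, which is the
failure mode described above.  Both sums inside standardize are of
the same kind, so they go through the accurate summation too.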

So it should be a "public" name to encourage people to use it.