[Python-Dev] Fwd: [Python-ideas] stats module Was: minmax() function ...

Raymond Hettinger raymond.hettinger at gmail.com
Fri Oct 15 22:00:16 CEST 2010


Hello guys.  If you don't mind, I would like to hijack your thread :-)

ISTM that the minmax() idea is really just an optimization request.
A single-pass minmax() is easily coded in simple, pure-python,
so really the discussion is about how to remove the loop overhead
(there isn't much you can do about the cost of the two compares
which is where most of the time would be spent anyway).
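
For reference, here is roughly what I mean by the pure-python version.
This is only a sketch: the name minmax() and the empty-input behavior
(raising ValueError, like min() and max() do) are my assumptions, not
a settled API.

```python
def minmax(iterable):
    """Return (smallest, largest) from a single pass over iterable."""
    it = iter(iterable)
    try:
        lo = hi = next(it)
    except StopIteration:
        # Assumed behavior: mirror min()/max() on empty input.
        raise ValueError("minmax() arg is an empty sequence") from None
    for x in it:
        # At most two compares per item; this is the cost that
        # no amount of loop optimization can remove.
        if x < lo:
            lo = x
        elif x > hi:
            hi = x
    return lo, hi
```

Because it only calls iter() and next(), it works on generators and
other one-shot iterators, not just sequences.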

My suggestion is to aim higher.   There is no reason a single pass
couldn't also return min/max/len/sum and perhaps even other summary
statistics like sum(x**2) so that you can compute standard deviation 
and variance.
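
A rough sketch of what such a single pass could look like follows.
The function name summarize() and the dict it returns are made up for
illustration, and it computes the population variance with the
textbook sum(x**2) formula, which can lose precision when the mean is
large relative to the spread.

```python
import math

def summarize(iterable):
    """Collect min, max, len, sum and sum of squares in one pass."""
    n = 0
    total = 0.0
    total_sq = 0.0
    lo = hi = None
    for x in iterable:
        if n == 0:
            lo = hi = x
        elif x < lo:
            lo = x
        elif x > hi:
            hi = x
        n += 1
        total += x
        total_sq += x * x
    if n == 0:
        raise ValueError("summarize() arg is an empty sequence")
    mean = total / n
    # Population variance: E[x**2] - E[x]**2, clamped at zero in case
    # rounding error pushes the difference slightly negative.
    variance = max(0.0, total_sq / n - mean * mean)
    return {
        "min": lo, "max": hi, "len": n, "sum": total,
        "sum_sq": total_sq, "mean": mean,
        "variance": variance, "stdev": math.sqrt(variance),
    }
```

With the data [2, 4, 4, 4, 5, 5, 7, 9] this gives a mean of 5.0, a
variance of 4.0 and a standard deviation of 2.0.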

A few years ago, Guido and other python devvers supported a
proposal I made to create a stats module, but I didn't have time
to develop it.  The basic idea was that python's batteries should
include most of the functionality available on advanced student
calculators.  Another idea behind it was that we could invisibly
do-the-right-thing under the hood to help users avoid numerical
problems (e.g. math.fsum(s)/len(s) is a more accurate way to
compute an average because it doesn't lose precision when
building-up the intermediate sums).
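
To make the fsum() point concrete, here is a small sketch.  The
mean() helper is hypothetical (no such function exists in the
standard library as of this writing); math.fsum() is real.

```python
import math

def mean(data):
    """Average that avoids precision loss in the intermediate sums."""
    data = list(data)          # need len(), so materialize iterators
    if not data:
        raise ValueError("mean() arg is an empty sequence")
    return math.fsum(data) / len(data)

values = [1e16, 1.0, -1e16]
naive = sum(values) / len(values)   # the 1.0 is lost when added to 1e16
careful = mean(values)              # fsum tracks the lost low-order bits
```

Here naive comes out as 0.0 because the 1.0 is swallowed by the first
addition, while careful is 1.0/3 as expected.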

I think the creativity and energy of this group are much better directed
at building a quality stats module (perhaps with some R-like capabilities).
That would likely be a better use of energy than bike-shedding 
about ways to speed-up a trivial piece of code that is ultimately
constrained by the cost of the compares per item.

my-two-cents,


Raymond

