[Python-Dev] Fwd: [Python-ideas] stats module Was: minmax() function ...

geremy condra debatem1 at gmail.com
Sat Oct 16 02:05:26 CEST 2010


On Fri, Oct 15, 2010 at 1:00 PM, Raymond Hettinger
<raymond.hettinger at gmail.com> wrote:
> Hello guys.  If you don't mind, I would like to hijack your thread :-)
>
> ISTM, that the minmax() idea is really just an optimization request.
> A single-pass minmax() is easily coded in simple, pure-python,
> so really the discussion is about how to remove the loop overhead
> (there isn't much you can do about the cost of the two compares
> which is where most of the time would be spent anyway).
>
> My suggestion is to aim higher.   There is no reason a single pass
> couldn't also return min/max/len/sum and perhaps even other summary
> statistics like sum(x**2) so that you can compute standard deviation
> and variance.
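
For illustration, here is a minimal pure-Python sketch of the single-pass idea quoted above. The name summarize() and its return tuple are hypothetical, not part of any proposed API, and the variance shown is the population variance derived from sum(x**2).

```python
from math import sqrt

def summarize(iterable):
    """Return (min, max, count, sum, sum of squares) in a single pass.

    Hypothetical helper for illustration only.
    """
    it = iter(iterable)
    try:
        first = next(it)
    except StopIteration:
        raise ValueError("summarize() arg is an empty sequence")
    lo = hi = first
    n = 1
    total = first
    total_sq = first * first
    for x in it:
        # A value below the current minimum cannot also exceed the maximum.
        if x < lo:
            lo = x
        elif x > hi:
            hi = x
        n += 1
        total += x
        total_sq += x * x
    return lo, hi, n, total, total_sq

lo, hi, n, total, total_sq = summarize([2, 4, 4, 4, 5, 5, 7, 9])
mean = float(total) / n
variance = float(total_sq) / n - mean ** 2   # population variance
std_dev = sqrt(variance)
print(lo, hi, n, mean, variance, std_dev)
```

Note that the sum-of-squares formula for variance can lose precision when the mean is large relative to the spread, so a real implementation would probably want a more careful update rule.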

+1 from me. Here's a normal cdf and chi squared cdf approximation I
use for randomness testing. They may need to be refined for inclusion,
but you're welcome to use them if you'd like.

from math import sqrt, erf

def normal_cdf(x, mu=0, sigma=1):
	"""Approximates the normal cumulative distribution"""
	# Standardize with (x - mu); 0.5 avoids integer division on Python 2.
	return 0.5 * (1 + erf((x-mu)/(sigma*sqrt(2))))

def chi_squared_cdf(x, k):
	"""Approximates the cumulative chi-squared statistic with k degrees
of freedom."""
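	# Wilson-Hilferty: (x/k)**(1/3) is roughly normal with mean 1 - 2/(9k)
	# and standard deviation sqrt(2/(9k)), so the CDF increases with x.
	numerator = ((float(x)/k)**(1.0/3)) - (1 - (2.0/(9*k)))
	denominator = (1.0/3) * sqrt(2.0/k)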
	return normal_cdf(numerator/denominator)
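
One way to sanity-check an approximation of this kind is to compare it against the closed form that exists when k is even. The sketch below assumes the approximation is the Wilson-Hilferty cube-root form; the names phi(), wh_chi2_cdf() and exact_chi2_cdf_even() are made up for this example.

```python
from math import erf, exp, sqrt

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def wh_chi2_cdf(x, k):
    """Wilson-Hilferty approximation to the chi-squared CDF."""
    mean = 1.0 - 2.0 / (9.0 * k)
    sd = sqrt(2.0 / (9.0 * k))
    return phi(((float(x) / k) ** (1.0 / 3.0) - mean) / sd)

def exact_chi2_cdf_even(x, k):
    """Exact chi-squared CDF, valid only for even k.

    Uses 1 - exp(-x/2) * sum((x/2)**i / i! for i in range(k/2)).
    """
    half = x / 2.0
    term = 1.0   # (x/2)**i / i!, starting at i = 0
    tail = 0.0
    for i in range(k // 2):
        tail += term
        term *= half / (i + 1)
    return 1.0 - exp(-half) * tail

for k in (2, 10, 50):
    x = float(k)  # evaluate near the mean of the distribution
    print(k, wh_chi2_cdf(x, k), exact_chi2_cdf_even(x, k))
```

The two columns should agree more closely as k grows, which is the regime where the normal approximation is expected to work best.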

> A few years ago, Guido and other python devvers supported a
> proposal I made to create a stats module, but I didn't have time
> to develop it.  The basic idea was that python's batteries should
> include most of the functionality available on advanced student
> calculators.  Another idea behind it was that we could invisibly
> do-the-right-thing under the hood to help users avoid numerical
> problems (e.g. math.fsum(s)/len(s) is a more accurate way to
> compute an average because it doesn't lose precision when
> building-up the intermediate sums).
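
A small sketch of the precision point quoted above, using only the standard library. The sample data is arbitrary and chosen because 0.1 has no exact binary representation.

```python
from math import fsum

data = [0.1] * 10

# Built-in sum() rounds after every addition, so the error accumulates.
naive_total = sum(data)
naive_mean = naive_total / len(data)

# math.fsum() tracks the lost low-order bits and rounds only once.
accurate_total = fsum(data)
accurate_mean = accurate_total / len(data)

print(repr(naive_total), repr(accurate_total))
print(repr(naive_mean), repr(accurate_mean))
```

On typical IEEE 754 platforms the two totals differ in the last digit, and only the fsum() version gives a mean equal to 0.1.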

Can you give some other examples? Sage does some of this and I
frequently find it annoying, actually, but I'm not sure if you're
referring to the same things there.

Geremy Condra

