[Python-Dev] Fwd: [Python-ideas] stats module Was: minmax() function ...
geremy condra
debatem1 at gmail.com
Sat Oct 16 02:05:26 CEST 2010
On Fri, Oct 15, 2010 at 1:00 PM, Raymond Hettinger
<raymond.hettinger at gmail.com> wrote:
> Hello guys. If you don't mind, I would like to hijack your thread :-)
>
> ISTM, that the minmax() idea is really just an optimization request.
> A single-pass minmax() is easily coded in simple, pure-python,
> so really the discussion is about how to remove the loop overhead
> (there isn't much you can do about the cost of the two compares
> which is where most of the time would be spent anyway).
>
> My suggestion is to aim higher. There is no reason a single pass
> couldn't also return min/max/len/sum and perhaps even other summary
> statistics like sum(x**2) so that you can compute standard deviation
> and variance.
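
The single-pass idea above can be sketched in pure Python. This is only an illustration, not a proposed API: the name `summarize` and the shape of its return value are assumptions made for the example, and the sum-of-squares variance formula is the textbook one, which can lose precision when the mean is large relative to the spread.

```python
import math

def summarize(iterable):
    """Hypothetical helper: return (n, total, lo, hi, sumsq) in one pass."""
    it = iter(iterable)
    try:
        first = next(it)
    except StopIteration:
        raise ValueError("summarize() arg is an empty sequence")
    n = 1
    total = first
    lo = hi = first
    sumsq = first * first
    for x in it:
        n += 1
        total += x
        sumsq += x * x
        # Two compares per element at most, as in a hand-written minmax().
        if x < lo:
            lo = x
        elif x > hi:
            hi = x
    return n, total, lo, hi, sumsq

def population_variance(n, total, sumsq):
    """Variance from the running sums: E[x**2] - (E[x])**2."""
    return (sumsq - total * total / float(n)) / n

n, total, lo, hi, sumsq = summarize([2, 4, 4, 4, 5, 5, 7, 9])
variance = population_variance(n, total, sumsq)
stddev = math.sqrt(variance)
```

For the sample data this gives a count of 8, a sum of 40, a minimum of 2, a maximum of 9, and a population standard deviation of 2.0.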
+1 from me. Here's a normal cdf and chi squared cdf approximation I
use for randomness testing. They may need to be refined for inclusion,
but you're welcome to use them if you'd like.
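from math import sqrt, erf

def normal_cdf(x, mu=0, sigma=1):
    """Normal cumulative distribution with mean mu and std dev sigma."""
    # Center on the mean: (x - mu), not (x + mu).
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2.0))))

def chi_squared_cdf(x, k):
    """Approximates the chi-squared cumulative distribution with k
    degrees of freedom (Wilson-Hilferty cube-root approximation)."""
    # Float literals avoid integer division truncating to 0 on Python 2.
    numerator = (float(x) / k) ** (1.0 / 3) - (1 - 2.0 / (9 * k))
    denominator = (1.0 / 3) * sqrt(2.0 / k)
    return normal_cdf(numerator / denominator)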
> A few years ago, Guido and other python devvers supported a
> proposal I made to create a stats module, but I didn't have time
> to develop it. The basic idea was that python's batteries should
> include most of the functionality available on advanced student
> calculators. Another idea behind it was that we could invisibly
> do-the-right-thing under the hood to help users avoid numerical
> problems (e.g. math.fsum(s)/len(s) is a more accurate way to
> compute an average because it doesn't lose precision when
> building-up the intermediate sums).
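
A small sketch of the precision point made above, assuming nothing beyond the standard library. The helper name `naive_total` is made up for the example; it uses an explicit running total so that the loss of precision in the intermediate sums is visible regardless of how the built-in sum() is implemented.

```python
import math

def naive_total(values):
    """Hypothetical helper: plain left-to-right floating-point accumulation."""
    total = 0.0
    for v in values:
        total += v
    return total

values = [0.1] * 10

naive_mean = naive_total(values) / len(values)    # rounding error accumulates
accurate_mean = math.fsum(values) / len(values)   # intermediate sums tracked exactly

print(repr(naive_total(values)), repr(math.fsum(values)))
print(repr(naive_mean), repr(accurate_mean))
```

Here the running total falls just short of 1.0 while math.fsum() returns exactly 1.0, so only the second mean comes out as 0.1.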
Can you give some other examples? Sage does some of this and I
frequently find it annoying, actually, but I'm not sure if you're
referring to the same things there.
Geremy Condra