On Fri, Aug 7, 2009 at 6:57 PM, <josef.pktd@gmail.com> wrote:
On Fri, Aug 7, 2009 at 6:13 PM, <josef.pktd@gmail.com> wrote:
On Fri, Aug 7, 2009 at 5:42 PM, <josef.pktd@gmail.com> wrote:
On Fri, Aug 7, 2009 at 5:25 PM, Andrew Hawryluk<HAWRYLA@novachem.com> wrote:
Hmm ... good point. It appears to give a probability distribution proportional to x**(a-1), but I see no good reason why the domain should be limited to [0,1].
def test(a): nums = plt.hist(np.random.power(a,100000),bins=100,ec='none',fc='#dddddd') x = np.linspace(0,1,200) plt.plot(x,nums[0][-1]*x**(a-1))
Andrew
-----Original Message----- From: numpy-discussion-bounces@scipy.org [mailto:numpy-discussion- bounces@scipy.org] On Behalf Of alan@ajackson.org Sent: 7 Aug 2009 2:49 PM To: Discussion of Numerical Python Subject: Re: [Numpy-discussion] Power distribution
I don't think that is it, since the one in numpy has a range restricted to the interval 0-1.
Try out hist(np.random.power(5, 1000000), bins=100)
You might get better results for 'power-law distribution' http://en.wikipedia.org/wiki/Power_law
Andrew
> -----Original Message----- > From: numpy-discussion-bounces@scipy.org [mailto:numpy-discussion- > bounces@scipy.org] On Behalf Of alan@ajackson.org > Sent: 7 Aug 2009 11:45 AM > To: Discussion of Numerical Python > Subject: [Numpy-discussion] Power distribution > > Documenting my way through the statistics modules in numpy, I ran > into the Power Distribution. > > Anyone know what that is? I Googled for it, and found a lot of stuff on > electricity, but no reference for a statistical distribution of that > name. Does it have a common alias? > > --
same is in Travis' notes on the distribution and scipy.stats.distributions domain in [0,1], but I don't know anything about it either
## Power-function distribution ## Special case of beta dist. with d =1.0
class powerlaw_gen(rv_continuous): def _pdf(self, x, a): return a*x**(a-1.0) def _cdf(self, x, a): return x**(a*1.0) def _ppf(self, q, a): return pow(q, 1.0/a) def _stats(self, a): return a/(a+1.0), a*(a+2.0)/(a+1.0)**2, \ 2*(1.0-a)*sqrt((a+2.0)/(a*(a+3.0))), \ 6*polyval([1,-1,-6,2],a)/(a*(a+3.0)*(a+4)) def _entropy(self, a): return 1 - 1.0/a - log(a) powerlaw = powerlaw_gen(a=0.0, b=1.0, name="powerlaw", longname="A power-function", shapes="a", extradoc="""
Power-function distribution
powerlaw.pdf(x,a) = a*x**(a-1) for 0 <= x <= 1, a > 0. """ )
it looks like it's the same distribution, even though it doesn't use the random numbers from the numpy function
high p-values with Kolmogorov-Smirnov, see below
I assume it is a truncated version of *a* powerlaw distribution, so that a can be large, which would be impossible in the open domain case. But a quick search, I only found powerlaw applications that refer to the tail behavior.
Josef
rvs = np.random.power(5, 100000) stats.kstest(rvs,'powerlaw',(5,)) (0.0021079715221341555, 0.76587118275752697) rvs = np.random.power(5, 1000000) stats.kstest(rvs,'powerlaw',(5,)) (0.00063983013407076239, 0.80757958281509501) rvs = np.random.power(0.5, 1000000) stats.kstest(rvs,'powerlaw',(0.5,)) (0.00081823148457027539, 0.51478478398950211)
I found a short reference in Johnson, Kotz, Balakrishnan vol. 1 where it is refered to as the "power-function" distribution. roughly: if X is pareto (which kind) distributed, then Y=X**(-1) is distributed according to the power-function distribution. JKB have an extra parameter in there and is a bit more general then the scipy version, or maybe it is just the scale parameter included in the density function.
It is also in NIST data plot, but I didn't find the html reference page, but only the pdf
the pdf-files for powpdf and powcdf are here http://www.itl.nist.gov/div898/software/dataplot/refman2/auxillar/homepage.h...
I can look some more a bit later tonight.
Josef
for the relationship to pareto, below are some kstests and graphs a reminder that numpy.random.pareto uses a non-standard 0 bound, instead of 1 ks tests don't show good numbers every once in a while, since they are random I checked the definitions in JKB (page 607) and my previous interpretation was correct. if X has a pareto distribution with lower bound at 1 and shape parameter a>0, then 1/X has a density function p(y) = a*y**(a-1), (0<y<1) weak inequality in JKB instead of strict as in scipy.stats.powerlaw docstring (the actual scipy.stats.powerlaw docstring has a typo, a**x**(a-1), which I will correct) Josef import numpy as np from scipy import stats import matplotlib.pyplot as plt rvs = np.random.power(5, 1000000) rvsp = np.random.pareto(5, 1000000) rvsps = stats.pareto.rvs(5, size=100) print "stats.kstest(1./rvsps,'powerlaw',(5,))" print stats.kstest(1./rvsps,'powerlaw',(5,)) print "stats.kstest(1./(1+rvsp),'powerlaw',(5,))" print stats.kstest(1./(1+rvsp),'powerlaw',(5,)) print "stats.kstest(rvs,'powerlaw',(5,))" print stats.kstest(rvs,'powerlaw',(5,)) print "stats.ks_2samp(rvs,1./(rvsp+1))" print stats.ks_2samp(rvs,1./(rvsp+1)) print "stats.ks_2samp(rvs,1./rvsps)" print stats.ks_2samp(rvs,1./rvsps) print "stats.ks_2samp(1+rvsp, rvsps)" print stats.ks_2samp(1+rvsp, rvsps) plt.figure() plt.hist(rvs, bins=50) plt.title('np.random.power(5)') plt.figure() plt.hist(1./(1.+rvsp), bins=50) plt.title('inverse of 1 + np.random.pareto(5)') plt.figure() plt.hist(1./(1.+rvsp), bins=50) plt.title('inverse of stats.pareto(5)') #plt.show()