I have some projects related to the gamma distribution. Would you please help me with these:

1) Assuming that I have a dataset drawn from a gamma distribution, how can I estimate parameters such as alpha and beta from the dataset?

2) Assuming that I know the alpha and beta of the distribution, how can I get the value for a given probability?

3) Assuming that I know the alpha and beta of the distribution, how can I get the probability for a given x?

Thank you so much.
Jinsheng you wrote:
I have some projects related to the gamma distribution. would you please help me with these:
First, please read the documentation in Lib/stats/continuous.pdf in the source distribution. The following probably won't make sense otherwise.
1) Assuming that I have a gamma distribution dataset, how can I estimate parameters such as alpha, beta from the dataset.
I assume for the rest of this post that alpha is the shape parameter and beta is the scale parameter.

Currently, the fit() method of distributions is broken. There's a problem with the way it passes arguments to the nnlf() method; that problem seems to apply to all distributions. There is also a problem with distributions like Gamma which are intrinsically positive: all of the distribution objects take a loc parameter. For intrinsically positive variates like Gamma, this really should be fixed to 0 all of the time.

So you are going to have to do a little bit of this manually.

    import numpy as np
    from scipy import stats

    def f(params, data):
        # Negative log-likelihood; params = (shape, scale), loc stays at its default of 0
        return -np.sum(np.log(stats.gamma.pdf(data, params[0], scale=params[1])))

And then you can use one of the minimizers in scipy.optimize to minimize f(). That's called the "maximum likelihood method," which may or may not be appropriate for what you want to do.
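The minimization step described above might look like the following. This is only a sketch, assuming a recent SciPy and NumPy: the synthetic data, the seed, the true values (shape 2, scale 3), the method-of-moments starting point, and the choice of Nelder-Mead are all illustrative assumptions, not part of the original advice. It also cross-checks against stats.gamma.fit() with floc=0, which pins loc to zero in current SciPy releases.

```python
import numpy as np
from scipy import optimize, stats


def neg_log_likelihood(params, data):
    """Negative log-likelihood of a gamma distribution with loc fixed at 0."""
    shape, scale = params
    if shape <= 0 or scale <= 0:
        return np.inf  # reject parameters outside the valid range
    return -np.sum(stats.gamma.logpdf(data, shape, loc=0, scale=scale))


# Synthetic data with known parameters (hypothetical values for illustration)
rng = np.random.default_rng(0)
data = stats.gamma.rvs(2.0, scale=3.0, size=5000, random_state=rng)

# Method-of-moments starting point: mean = shape*scale, var = shape*scale**2
mean, var = data.mean(), data.var()
start = [mean**2 / var, var / mean]

result = optimize.minimize(
    neg_log_likelihood, start, args=(data,), method="Nelder-Mead"
)
shape_hat, scale_hat = result.x

# Cross-check: built-in maximum likelihood fit with loc pinned to zero
shape_fit, loc_fit, scale_fit = stats.gamma.fit(data, floc=0)

print("minimize :", shape_hat, scale_hat)
print("gamma.fit:", shape_fit, scale_fit)
```

The two sets of estimates should agree closely, since both maximize the same likelihood. Whether maximum likelihood is the right estimator for a given dataset is a separate question.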
2) assuming that I know the alpha and beta of the distribution, how can I get the value for a given probability.
Probability of what?
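If the question is read as "which x has cumulative probability p, i.e. P(X <= x) = p", that is the quantile (inverse CDF), which scipy.stats exposes as the ppf() method. A minimal sketch under that reading; the values of alpha, beta and p are hypothetical:

```python
from scipy import stats

alpha, beta = 2.0, 3.0   # hypothetical shape and scale
p = 0.95                 # hypothetical cumulative probability

# Quantile: the x for which P(X <= x) equals p
x = stats.gamma.ppf(p, alpha, scale=beta)

# Round trip through the CDF should recover p
p_back = stats.gamma.cdf(x, alpha, scale=beta)

print(x, p_back)
```

For an upper-tail probability, P(X >= x) = p, the corresponding call would be stats.gamma.isf(p, alpha, scale=beta).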
3) assuming that I know the alpha and beta of the distribution, how can I get the probability for a given x.
By definition, the probability for any given point value is 0. You can get the value of the probability *density function* by using the pdf() method of scipy.stats.gamma. If you want the probability of getting a value <= x, then the cdf() method will give you that. If you want the probability of getting a value >= x, then 1 - cdf(x).

--
Robert Kern
rkern@ucsd.edu

"In the fields of hell where the grass grows high
 Are the graves of dreams allowed to die."
  -- Richard Harter
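The pdf() and cdf() calls mentioned in the reply above can be sketched as follows. The parameter values and the point x are hypothetical; sf() is the survival function that scipy.stats provides for 1 - cdf(x). The closed-form check only holds for the shape value 2 chosen here.

```python
import math
from scipy import stats

alpha, beta = 2.0, 3.0   # hypothetical shape and scale
x = 4.0                  # hypothetical point of interest

density = stats.gamma.pdf(x, alpha, scale=beta)  # density at x, not a probability
p_le = stats.gamma.cdf(x, alpha, scale=beta)     # P(X <= x)
p_ge = stats.gamma.sf(x, alpha, scale=beta)      # P(X >= x), same as 1 - cdf(x)

# Probability of landing in an interval, here 3 <= X <= 5
p_between = (stats.gamma.cdf(5.0, alpha, scale=beta)
             - stats.gamma.cdf(3.0, alpha, scale=beta))

# For shape 2 the CDF has the closed form 1 - exp(-x/beta) * (1 + x/beta)
closed_form = 1.0 - math.exp(-x / beta) * (1.0 + x / beta)

print(density, p_le, p_ge, p_between, closed_form)
```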
Dear Robert Kern,

I greatly appreciate your help. I will try this and also read the documentation before asking any further questions.

Thanks,
Jinsheng You
From: Robert Kern <rkern@ucsd.edu>
Reply-To: SciPy Users List <scipy-user@scipy.net>
To: SciPy Users List <scipy-user@scipy.net>
Subject: Re: [SciPy-user] Gamma distribution questions
Date: Thu, 08 Sep 2005 15:07:57 -0700
Robert Kern wrote, way back on September 8:
Currently, the fit() method of distributions is broken. There's a problem with the way it passes arguments to the nnlf() method; that problem seems to apply to all distributions.
I actually fixed this, kind of, some time ago. See http://www.scipy.net/roundup/scipy/issue230

With this fix, I can do the following:

    import scipy as S
    from matplotlib.pyplot import plot  # assumption: plot() here is matplotlib's

    x = S.stats.lognorm.rvs(0.75, loc=5, scale=3, size=(500,))
    (shape, ppcc) = S.stats.ppcc_plot(x, 0.1, 4, dist='lognorm')
    plot(shape, ppcc)
    # Notice nice local maximum near shape=0.75
    (osm, osr), (scale, loc, r) = S.stats.probplot(x, sparams=0.75, dist='lognorm', fit=1)
    # Returns scale and loc which are tolerably close to 3 and 5, respectively.
    plot(osm, osr)
    # Notice this is fairly linear

However, S.stats.ppcc_max(x, (0.1, 4.), dist='lognorm') seems to return nonsense with this test data (-5 or -6), although it produces the correct value with some real data I have.
There is also a problem with distributions like Gamma which are intrinsically positive; all of the distribution objects take a loc parameter. For intrinsically positive variates like Gamma, this really should be fixed to 0 all of the time.
I think I disagree. I have sunspot group area data which are more or less lognormally (or perhaps weibull_min) distributed but which definitely have a nonzero loc parameter.

When will this parameter fitting capability be in "new scipy" and in what form? What can I do to help?

Steve Walton

P.S. Does anyone read the Bug Tracker entries at scipy.org? My morestats.py fix is over three months old.
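The disagreement about loc can be explored numerically. The sketch below is an illustration only, assuming a recent SciPy: it draws synthetic gamma data with a deliberately nonzero loc (all values hypothetical), fits once with loc pinned to 0 via floc=0, and then lets loc vary in a hand-written likelihood minimization started from that fixed-loc solution. Nelder-Mead is an arbitrary choice of minimizer here.

```python
import numpy as np
from scipy import optimize, stats


def nll(params, data):
    """Negative log-likelihood of a three-parameter (shape, loc, scale) gamma."""
    shape, loc, scale = params
    # loc must lie below every observation, otherwise the likelihood is zero
    if shape <= 0 or scale <= 0 or loc >= data.min():
        return np.inf
    return -np.sum(stats.gamma.logpdf(data, shape, loc=loc, scale=scale))


# Synthetic data shifted away from zero (hypothetical parameters)
rng = np.random.default_rng(1)
data = stats.gamma.rvs(2.0, loc=5.0, scale=3.0, size=2000, random_state=rng)

# Fit with loc pinned to zero
shape0, loc0, scale0 = stats.gamma.fit(data, floc=0)
nll_fixed = nll((shape0, loc0, scale0), data)

# Let loc vary, starting from the fixed-loc solution
result = optimize.minimize(
    nll, [shape0, loc0, scale0], args=(data,), method="Nelder-Mead"
)
shape1, loc1, scale1 = result.x
nll_free = result.fun

print("loc fixed at 0:", shape0, loc0, scale0, nll_fixed)
print("loc free      :", shape1, loc1, scale1, nll_free)
```

Because the free-loc search starts from the fixed-loc optimum, its negative log-likelihood cannot come out higher. How much lower it gets is one rough indication of whether the data call for a nonzero loc.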
Stephen Walton wrote:
Robert Kern wrote, way back on September 8:
Currently, the fit() method of distributions is broken. There's a problem with the way it passes arguments to the nnlf() method; that problem seems to apply to all distributions.
I actually fixed this, kind of, some time ago. See
P.S. Does anyone read the Bug Tracker entries at scipy.org? My morestats.py fix is over three months old.
I don't read it regularly -- I read email much more. So make sure to post an announcement to this list.

Also, I have been preoccupied with the new core system. Also, the move to svn put a big delay on placing any fixes into scipy as several people lost write access.

So, don't be discouraged. There have been some growing pains this summer. We also need more people who are willing to have write access to the actual source tree. Any takers? (I won't just let anybody make changes, but if you've already contributed to SciPy in some fashion, you are a likely candidate...)

-Travis O.