[SciPy-user] Fitting a distribution to some data

Jose Luis Gomez Dans josegomez at gmx.net
Wed Sep 6 12:44:04 EDT 2006


Hi,
I am trying to fit a distribution to some data points (actually, I want to test which distribution best models the data). I also want to estimate the distribution's parameters. I am sure that the stats module has many functions to help me with this, but I am not sure I understand how to use them.

As a test, I create some random variates using, say, y=stats.distributions.norm.rvs(size=100).
I can then estimate the mean and standard deviation using
(mu,sigma)=stats.distributions.norm.fit(y)
which works fine. However, some methods do not work. For example, the pdf(self,x..) method returns an array of 0s, and so does the cdf() method, whereas the _pdf() and _cdf() methods give the desired result. Is this a bug? Am I supposed to use the underscore methods or the public methods?
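
For reference, here is a minimal sketch of what I would expect to work through the public interface. It assumes a SciPy release in which pdf() and cdf() behave as documented and accept loc and scale keyword arguments. The names x, pdf_vals and cdf_vals and the fixed seed are only illustrative.

```python
import numpy as np
from scipy import stats

# Draw 100 standard-normal variates; the seed is fixed only for reproducibility.
y = stats.norm.rvs(size=100, random_state=0)

# Maximum-likelihood estimates of the location (mean) and scale (std. dev.).
mu, sigma = stats.norm.fit(y)

# Evaluate the fitted density and distribution function on a small grid
# using the public methods rather than the underscore ones.
x = np.linspace(-3.0, 3.0, 7)
pdf_vals = stats.norm.pdf(x, loc=mu, scale=sigma)
cdf_vals = stats.norm.cdf(x, loc=mu, scale=sigma)
```

With the public methods the fitted parameters are passed as loc and scale; as far as I understand, the underscore methods only handle the standardized distribution and are meant for internal use.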

Also, is there an example of how to use kstest? It might be related, but if I try to test the previous data, I get the following error:
stats.kstest(y,'norm',args=(mu,sigma))
---------------------------------------------------------------------------
exceptions.TypeError                                 Traceback (most recent call last)

/home/ggjlgd/<ipython console>

/usr/lib/python2.4/site-packages/scipy/stats/stats.py in kstest(rvs, cdf, args, N)
   1721 #    D = max(D1,D2)
   1722     D = D1
-> 1723     return D, distributions.ksone.sf(D,N)
   1724
   1725 def chisquare(f_obs, f_exp=None):

/usr/lib/python2.4/site-packages/scipy/stats/distributions.py in sf(self, x, *args, **kwds)
    521         output = zeros(shape(cond),'d')
    522         insert(output,(1-cond0)*(cond1==cond1),self.badvalue)
--> 523         insert(output,cond2,1.0)
    524         goodargs = argsreduce(cond, *((x,)+args))
    525         insert(output,cond,self._sf(*goodargs))

/usr/lib/python2.4/site-packages/numpy/lib/function_base.py in insert(arr, obj, values, axis)
   1190
   1191     obj = asarray(obj, dtype=intp)
-> 1192     numnew = len(obj)
   1193     index1 = obj + arange(numnew)
   1194     index2 = setdiff1d(arange(numnew+N),index1)

TypeError: len() of unsized object
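
For comparison, this is a minimal sketch of the call I am attempting, written as I would expect it to run on a SciPy release where kstest works as documented. The names D and p_value and the fixed seed are only illustrative.

```python
from scipy import stats

# Sample data and fitted parameters, as in the test described earlier.
y = stats.norm.rvs(size=100, random_state=0)
mu, sigma = stats.norm.fit(y)

# One-sample Kolmogorov-Smirnov test against a normal distribution;
# args supplies (loc, scale) to the named distribution's cdf.
D, p_value = stats.kstest(y, 'norm', args=(mu, sigma))

# Caveat: mu and sigma were estimated from the same data being tested,
# so the reported p-value tends to be larger than it should be.
print("KS statistic:", D, "p-value:", p_value)
```

Because the parameters are estimated from the sample itself, the p-value is probably best read as a rough indication rather than an exact significance level.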


Am I doing something wrong here? Is it a bug? My versions are
Scipy: 0.5.0.2178
Numpy: 1.0b5
and I am running Kubuntu Dapper (Linux), using Andrew Straw's packages.

Thanks
Jose


