[SciPy-user] Fitting - Gauss-normal distribution

David Warde-Farley david.warde.farley at utoronto.ca
Mon Feb 26 13:46:22 EST 2007


Hi,

So, it should be noted that the method of moments estimators for a
Gaussian distribution are also the maximum likelihood estimators, i.e.
the ones that maximize p(data|parameters), as well as the best least
square estimator (since taking the log of the density function gives you
scaled squared distance from the mean). So optimizing iteratively is
hardly necessary in this case.

i.e. what David Huard wrote is probably what you're looking for, and the
best you're going to be able to do if your goal is just to fit a
Gaussian to the data.

David


On Mon, 2007-02-26 at 10:04 -0500, David Huard wrote:
> Hi Marian,
> 
> You could fit the normal using the method of moments:
> 
> mu = list.mean()
> sigma = list.std()
> 
> x = linspace(list.min(), list.max(), 100)
> pdf = scipy.stats.norm.pdf(x, mu, sigma)
> s = subplot(111) 
> s.plot(x,y)
> s.hist(list, 20, normed=True)
> 
> 
> David
> 
> 2007/2/26, Christian Meesters <meesters at uni-mainz.de>:
>         Hi,
>         
>         you are welcome to use this code - not tested, since it is
>         only rougly
>         translated from a more complex function of mine:
>         
>         from scipy import std
>         from scipy.optimize import leastsq
>         
>         def fit_gaussian(x_data, y_data): # ?_data should be numpy
>         arrays 
>                 # estimate the expectation value
>                 expect = y_data[argmax(x_data)]
>                 # find +/- 10 elements around the peak
>                 subxdata = x_data[expect-10:expect+11]
>                 subydata = y_data[expect-10:expect+11] 
>                 #estimate the std
>                 sigma = std([inpt for inpt subydata  in if inpt >
>         100.0])/len(subydata)**2
>         #really dirty hack!!
>                 #estimate the maximum
>                 maximum = max(y_data)
>                 #define starting paramters (as 'first guess') 
>                 parameters0 = [sigma, expect, maximum]
>         
>                 def __residuals(params, value, inpt):
>                     """
>                         calculates the resdiuals
>                     """
>                     sigma, expect, maximum = params 
>                     err = value - (maximum *
>         exp((-((inpt-expect)/sigma)**2)/2))
>                     #the equation above allows for adding a constant
>                     return err
>         
>                 def __peval(inpt, params):
>                     """ 
>                         evaluates the function
>                     """
>                     sigma, expect, maximum = params
>                     return (maximum *
>         exp((-((inpt-expect)/sigma)**2)/2))
>         
>                 #calculate fit paramters 
>                 plsq = leastsq(__residuals,  parameters0,
>         args=(subintensity,
>         subchannel))
>                 #calculate 'full width half maximum' parameter for a
>         gaussian fit
>                 fwhm = 2*sqrt(2*log(2))*plsq[0][0]
>                 return plsq[0], fwhm, subxdata , subydata,
>         __peval(subchannel,
>         plsq[0])
>         
>         The code above is not really neat, but allows for a peak
>         shifted along your
>         'y-axis'. The return value is a tuple of
>         (simga,mu,max),FWHMsubarray of x 
>         data, subarray of ydata,
>         and the fitted function as an array.
>         
>         See
>         http://www.scipy.org/Wiki/Documentation?action=AttachFile&do=get&target=scipy_tutorial.pdf
>         for more information - the description of leastsq is really
>         good!
>         
>         HTH
>         Christian
>         
>         
>         
>         
>         
>         
>         On Monday 26 February 2007 14:34, Marian Jakubik wrote:
>         > Hi, I am a SciPy newbie solving this problem: 
>         >
>         > I would like to fit data with gaussian normal
>         distribution.... First, I
>         > generated data:
>         >
>         > list=normal(0.00714,0.0005,140)
>         >
>         > Then I plot this data:
>         >
>         > pylab.hist (list,20)
>         >
>         > And at the end, I'd like to plot a gauss fit in the graph,
>         also....
>         >
>         > Could anyone help me, please?
>         >
>         > Marian
>         >
>         > This is my code:
>         >
>         > from numpy import * 
>         > from RandomArray import *
>         > import pylab as p
>         >
>         > list=normal(0.00714,0.0005,140)
>         >
>         > p.hist(list,20)
>         > p.show()
>         >
>         >
>         >
>         > _______________________________________________ 
>         > SciPy-user mailing list
>         > SciPy-user at scipy.org
>         > http://projects.scipy.org/mailman/listinfo/scipy-user
>         _______________________________________________
>         SciPy-user mailing list
>         SciPy-user at scipy.org
>         http://projects.scipy.org/mailman/listinfo/scipy-user
> 
> _______________________________________________
> SciPy-user mailing list
> SciPy-user at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-user




More information about the SciPy-User mailing list