[SciPy-user] scipy.optimize.leastsq and covariance matrix meaning

Bruce Southey bsouthey at gmail.com
Thu Nov 6 16:05:21 EST 2008


massimo sandal wrote:
> Hi,
>
> I am having trouble with the covariance matrix in the output of 
> scipy.optimize.leastsq . I am trying to find the estimated sigma of 
> the parameters obtained by the fit. Please bear with me since my 
> statistics knowledge is poor. I understand that the diagonal of the 
> covariance matrix should give me the variance of each parameter.
>
> Problems are:
> 1) The variances of these parameters look unreasonably large to me, 
> despite the fact that I obtain an *excellent* fit over a lot of data 
> points (and values extremely consistent with what is expected).
> 2) The off-diagonal values of the covariance are also unreasonably 
> large, which makes me doubt that simply picking the diagonal values is 
> the correct thing to do.
>
> The residuals function is:
>
>         def residuals(params,y,x,T):
>             '''
>             Calculates the residuals of the fit
>             '''
>             lambd, pii=params
>
>             Kb=(1.38065e-23)
>             therm=Kb*T
>
>             err = y - ((therm*pii/4) * (((1 - (x*lambd))**-2) - 1 + (4*x*lambd)))
>
>             return err
>
> For example, a typical set of fitted values is:
> 4390808.6184609979
> 3993219683.7749424
>
> and the corresponding covariance matrix is
> [[  1.97019986e+29  -2.67163157e+33]
>  [ -2.67163157e+33   3.78415451e+37]]
>
> ...which concerns me.
>
> m.
>   
The values could be correct if the values of y are large and 
sufficiently variable. But given your comment on the fit, and given that 
the correlation implied by the matrix above is -0.98, my expectation is 
that there is almost no error/residual variation left. The residual 
variance (the sum of squared residuals divided by the degrees of 
freedom) should be very small.
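
For reference, the -0.98 figure can be reproduced from the quoted matrix 
by dividing the off-diagonal covariance by the product of the two 
standard deviations. This is only a small sketch using the numbers from 
the message; the variable names are mine.

```python
import math

# Covariance matrix exactly as quoted in the original message
cov = [[1.97019986e+29, -2.67163157e+33],
       [-2.67163157e+33, 3.78415451e+37]]

# Standard deviations are the square roots of the diagonal entries
sd_first = math.sqrt(cov[0][0])
sd_second = math.sqrt(cov[1][1])

# Correlation = covariance / (sd_first * sd_second)
corr = cov[0][1] / (sd_first * sd_second)
print(round(corr, 2))  # prints -0.98
```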

You don't provide enough details, but your two x variables would appear 
to be virtually collinear, given the very high correlation. There may 
be other reasons, but without the data etc. I cannot guess.
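
One thing worth checking: the cov_x returned by leastsq (with 
full_output=True) is not yet the covariance of the parameter estimates. 
According to the scipy documentation it has to be multiplied by the 
residual variance first. The sketch below shows that scaling on made-up 
data. Everything specific in it is an assumption of mine, not taken from 
the original post: the synthetic data, the noise level, T = 298 K, and 
the change of units to pN and nm (so that the parameters are of order 
one instead of 1e6 and 1e9).

```python
import numpy as np
from scipy.optimize import leastsq

# Thermal energy in pN*nm; T = 298 K is an assumed value
KT = 1.38065e-2 * 298.0

def model(params, x):
    """Same functional form as in the original residuals function."""
    lambd, pii = params
    return (KT * pii / 4) * ((1 - x * lambd) ** -2 - 1 + 4 * x * lambd)

def residuals(params, y, x):
    return y - model(params, x)

# Synthetic data (hypothetical): x in nm, y in pN, Gaussian noise of 1 pN
rng = np.random.default_rng(0)
x = np.linspace(20.0, 190.0, 50)
true_params = (1 / 228.0, 4.0)
y = model(true_params, x) + rng.normal(0.0, 1.0, x.size)

p_fit, cov_x, info, mesg, ier = leastsq(
    residuals, (0.004, 3.5), args=(y, x), full_output=True
)

# Residual variance: sum of squared residuals / degrees of freedom
dof = x.size - len(p_fit)
s2 = np.sum(info["fvec"] ** 2) / dof

# Scale cov_x to get the covariance of the parameter estimates
cov = cov_x * s2
sigma = np.sqrt(np.diag(cov))             # standard errors
corr = cov[0, 1] / (sigma[0] * sigma[1])  # parameter correlation

print("fitted parameters:", p_fit)
print("standard errors:  ", sigma)
print("correlation:      ", corr)
```

If the residual variance is very small, as expected for an excellent 
fit, the scaled variances shrink accordingly. The correlation between 
the parameters is unaffected by this scaling.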


Bruce


