[SciPy-user] incorrect variance in optimize.curve_fit and leastsq

josef.pktd at gmail.com
Thu Feb 12 22:16:26 EST 2009


I just saw the new optimize.curve_fit, which provides a wrapper around
optimize.leastsq.

optimize.leastsq provides the raw covariance matrix (cov_x). As I
mentioned once on the mailing list, this is not the covariance matrix
of the parameter estimates. To get that, the raw covariance matrix
has to be multiplied by the residual variance, SSE/(n - p)
(equivalently, the standard deviations of the estimates pick up a
factor sqrt(SSE/(n - p))). So the docstring in optimize.curve_fit
doesn't correspond to the actual calculation.
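
For reference, here is a minimal sketch of the correction on a made-up
model (the model, data, and names below are purely illustrative, not
taken from curve_fit itself):

import numpy as np
from scipy import optimize

def f(x, a, b):
    return a * (1.0 - np.exp(-b * x))

x = np.linspace(0.1, 5.0, 20)
y = f(x, 2.5, 1.3) + 0.01 * np.sin(7 * x)  # deterministic stand-in for noise

def resid(p):
    return y - f(x, *p)

popt, cov_x, infodict, mesg, ier = optimize.leastsq(
    resid, [1.0, 1.0], full_output=1)

# cov_x is the raw covariance, approximately inv(J'J); scale it by the
# residual variance s**2 = SSE / (n - p) to get the parameter covariance
sse = np.sum(resid(popt)**2)
s_sq = sse / (len(y) - len(popt))
pcov = s_sq * cov_x
perr = np.sqrt(np.diag(pcov))  # standard deviations of the estimates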

I'm preparing a test against an example from the NIST certified cases:
http://www.itl.nist.gov/div898/strd/nls/data/misra1b.shtml

>>> SSE = np.sum((y - yh)**2)   # yh: fitted values at the estimated parameters

Difference in the standard deviations of the parameter estimates
compared to the NIST certified values (Misra1b has n = 14 observations
and p = 2 parameters, so n - p = 12):

>>> np.sqrt(SSE/12.0)*np.sqrt(np.diag(pcov))[0] - 3.1643950207E+00
2.4865573906573957e-006
>>> np.sqrt(SSE/12.0)*np.sqrt(np.diag(pcov))[1] - 4.2547321834E-06
3.6275492099440408e-012

The agreement for the first parameter is not exactly high precision.

The second problem is that, for weighted least squares, the calculation
of the standard deviations of the parameter estimates has to take the
weights into account. (I don't have the formulas at hand right now, but
a sketch of the convention I would expect is below.)
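
Assuming the weights w_i are proportional to 1/sigma_i**2 up to an
unknown common factor, and that leastsq was run on sqrt(w)*resid (my
assumption, not checked against a reference), the scaling would be:

import numpy as np

def scaled_cov(cov_x, resid, w, n_params):
    # weighted residual variance; reduces to SSE / (n - p) when w == 1
    # (assumes w_i ~ 1/sigma_i**2 up to an unknown common factor)
    s_sq = np.sum(w * resid**2) / (len(resid) - n_params)
    return s_sq * cov_x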

I was looking at this because I want to provide a general non-linear
least squares class in stats, but for several calculations the Jacobian
would be necessary. optimize.leastsq only provides cov_x. I was
wondering whether the Jacobian can be recovered from the extra returns
of the MINPACK functions in optimize.leastsq, but I didn't have time to
figure this out.
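
For what it's worth: with full_output=1, leastsq's infodict contains
'fjac' and 'ipvt', the column-pivoted QR factorization that MINPACK
builds for the final Jacobian. From those one can at least recover J'J
without forming Q; if I read minpack.py correctly, this is essentially
how cov_x itself is computed. A sketch, with a made-up model and data:

import numpy as np
from scipy import optimize

def resid(p, x, y):
    return y - p[0] * (1.0 - np.exp(-p[1] * x))

x = np.linspace(0.1, 5.0, 20)
y = 2.5 * (1.0 - np.exp(-1.3 * x))

popt, cov_x, infodict, mesg, ier = optimize.leastsq(
    resid, [1.0, 1.0], args=(x, y), full_output=1)

# MINPACK stores R from the pivoted QR factorization of the Jacobian in
# the upper triangle of fjac (transposed) and the pivoting in ipvt
n = len(popt)
perm = np.eye(n)[infodict['ipvt'] - 1]   # permutation matrix
r = np.triu(infodict['fjac'].T[:n, :])   # upper-triangular R
R = np.dot(r, perm)                      # undo the column pivoting
JTJ = np.dot(R.T, R)                     # J'J at the solution

print(np.allclose(np.linalg.inv(JTJ), cov_x))  # cov_x == inv(J'J)

This gives J'J rather than J itself, but for covariances and standard
errors that may already be enough.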

Josef


