Covariance matrix from curve_fit
Hi everyone,

I have a question regarding the output from the scipy.optimize.curve_fit function - in the following example:

"""
In [1]: import numpy as np

In [2]: from scipy.optimize import curve_fit

In [3]: f = lambda x, a, b: a * x + b

In [4]: x = np.array([0., 1., 2.])

In [5]: y = np.array([1.2, 4.6, 7.8])

In [6]: e = np.array([1., 1., 1.])

In [7]: curve_fit(f, x, y, sigma=e)
Out[7]:
(array([ 3.3       ,  1.23333333]),
 array([[ 0.00333333, -0.00333333],
        [-0.00333333,  0.00555556]]))

In [8]: curve_fit(f, x, y, sigma=e * 100)
Out[8]:
(array([ 3.3       ,  1.23333333]),
 array([[ 0.00333333, -0.00333333],
        [-0.00333333,  0.00555556]]))
"""

it's clear that the covariance matrix does not take into account the uncertainties on the data points. If I do:

"""
popt, pcov = curve_fit(...)
"""

Then pcov[0,0]**0.5 is therefore not the uncertainty on the parameter, so I was wondering how this should be scaled to give the actual uncertainty on the parameter?

Thanks!
Tom
On Sun, Jun 16, 2013 at 3:24 AM, Thomas Robitaille <thomas.robitaille@gmail.com> wrote:
Then pcov[0,0]**0.5 is therefore not the uncertainty on the parameter, so I was wondering how this should be scaled to give the actual uncertainty on the parameter?
There was a long discussion by email and then github on this:

http://mail.scipy.org/pipermail/scipy-user/2011-August/030412.html
https://github.com/scipy/scipy/pull/448

The open pull request has the code to do the scaling you want.

- Tom
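For the example in this thread, the rescaling can be sketched by hand. This is a minimal sketch, not the code from the pull request: it assumes that the covariance returned by curve_fit has been multiplied by the reduced chi-square of the fit (which is consistent with the output shown in the original question), so dividing that factor back out gives a covariance that reflects ``sigma`` as absolute 1-sigma errors. The names `chi2_red`, `pcov_abs` and `perr` are illustrative only.

```python
import numpy as np
from scipy.optimize import curve_fit


def f(x, a, b):
    """Straight-line model used in the thread."""
    return a * x + b


x = np.array([0., 1., 2.])
y = np.array([1.2, 4.6, 7.8])
e = np.array([1., 1., 1.])

# Default behaviour: sigma acts as relative weights and the returned
# covariance is scaled by the reduced chi-square of the fit.
popt, pcov = curve_fit(f, x, y, sigma=e)

# Reduced chi-square: weighted sum of squared residuals per degree of freedom.
dof = len(x) - len(popt)
chi2_red = np.sum(((y - f(x, *popt)) / e) ** 2) / dof

# Undo the scaling so that sigma is treated as absolute uncertainties.
pcov_abs = pcov / chi2_red

# 1-sigma uncertainties on the fitted parameters a and b.
perr = np.sqrt(np.diag(pcov_abs))
print(popt)
print(perr)
```

Note that this only works when there are more data points than parameters, since the reduced chi-square is undefined for zero degrees of freedom.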
_______________________________________________
SciPy-User mailing list
SciPy-User@scipy.org
http://mail.scipy.org/mailman/listinfo/scipy-user
Hi Tom,

On 16 June 2013 12:57, Aldcroft, Thomas <aldcroft@head.cfa.harvard.edu> wrote:
There was a long discussion by email and then github on this:
http://mail.scipy.org/pipermail/scipy-user/2011-August/030412.html
https://github.com/scipy/scipy/pull/448
Thanks for pointing me to this discussion and pull request. I think this pull request should be finalized and, most importantly, the documentation of curve_fit improved. At the moment, the name ``sigma`` implies that the uncertainties are 1-sigma normal deviations, which to me (and a number of other Python users I know) implies that the covariance matrix takes this into account in the parameter uncertainties. I understand that the new (lack of) scaling will have to be optional for backward-compatibility reasons, but it's unfortunate given the connotations a variable like ``sigma`` has...

Cheers,
Tom
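The two readings of ``sigma`` discussed here can be compared directly. The sketch below assumes a SciPy release that provides the `absolute_sigma` keyword of curve_fit, which was added after this thread (in SciPy 0.14, as far as I know) and is not part of the version being discussed. The variable names are illustrative only.

```python
import numpy as np
from scipy.optimize import curve_fit


def f(x, a, b):
    """Straight-line model used in the thread."""
    return a * x + b


x = np.array([0., 1., 2.])
y = np.array([1.2, 4.6, 7.8])
e = np.array([1., 1., 1.])

# Relative weights (the default): multiplying sigma by a constant
# should leave the returned covariance unchanged.
_, pcov_rel_1 = curve_fit(f, x, y, sigma=e)
_, pcov_rel_100 = curve_fit(f, x, y, sigma=e * 100)

# Absolute 1-sigma errors (assumes the absolute_sigma keyword exists):
# the covariance should grow with the square of the error scale.
_, pcov_abs_1 = curve_fit(f, x, y, sigma=e, absolute_sigma=True)
_, pcov_abs_100 = curve_fit(f, x, y, sigma=e * 100, absolute_sigma=True)

print(np.allclose(pcov_rel_1, pcov_rel_100, rtol=1e-4))
print(np.allclose(pcov_abs_100, pcov_abs_1 * 100 ** 2, rtol=1e-4))
```

With absolute errors, the parameter uncertainties follow the usual error-propagation expectation: errors 100 times larger on the data give parameter uncertainties 100 times larger.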
participants (2)
- Aldcroft, Thomas
- Thomas Robitaille