Re: [SciPy-user] nonlinear fit with non uniform error?
Trevis,

2007/6/21, Trevis Crane <t_crane@mrl.uiuc.edu>:
As an aside, will those of you who are *more* in the know on this topic than the rest of us suggest a good text that has a worthwhile treatment of this subject (as well as other related data analysis/statistical issues)?
My bible is Probability Theory: The Logic of Science by E. T. Jaynes.
http://omega.albany.edu:8008/JaynesBook.html
It is less a book about optimization and fitting than one about the general principles of probability, but it was worth the reading time.

There is a paper in the hydrological literature (Sorooshian and Dracup, Water Resources Research, vol. 16, no. 2, 1980) that discusses the calibration of hydrologic models in correlated and heteroscedastic error cases. I guess every discipline has a paper similar to this one, but this is the one I know.

There is also a book by A. Zellner, An Introduction to Bayesian Inference in Econometrics (1971), that I found helpful.

As you can see, I'm not aware of a comprehensive treatise on the subject. I just picked up bits from different articles.

HTH,
David

I'd love to learn more about it, but just jumping on Amazon and picking a book almost at random seems like a good way to waste a lot of money I don't have on books that I don't need, so if you have a favorite reference or text, I'm interested in knowing about it.
thanks,
trevis
-----Original Message-----
From: scipy-user-bounces@scipy.org [mailto:scipy-user-bounces@scipy.org] On Behalf Of David Huard
Sent: Thursday, June 21, 2007 8:09 AM
To: SciPy Users List
Subject: Re: [SciPy-user] nonlinear fit with non uniform error?
Hi,
What you have is a heteroscedastic normal distribution (one whose variance changes from point to point) describing the residuals.
2007/6/21, Matthieu Brucher <matthieu.brucher@gmail.com>:
1) Does this mean that least squares is NOT ok?
Yes, LS is _NOT_ OK because it assumes that the distribution (with its parameters) is the same for all errors. I don't remember exactly, but this may be due to ergodicity.
Well, let's put things in perspective. You can still use ordinary least squares. Theoretically, this means you are assuming that the error mean and variance are fixed and constant. In your case this is not true, so you can treat the LS solution as an approximation. Under this approximation, the large errors on Cy will tend to dominate the residuals, and the values in Ay will probably not be fitted optimally. I advise you to try it anyway and check visually whether that matters to you.
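To see the effect described above, here is a minimal sketch that fits the same synthetic data twice with scipy.optimize.curve_fit: once unweighted, once with each residual divided by its standard deviation (the sigma argument). The exponential model, the three noise levels standing in for A, B and C, and all numbers are made up for illustration; they are not from the original poster's problem.

```python
import numpy as np
from scipy.optimize import curve_fit


def model(x, a, b):
    """Hypothetical nonlinear model standing in for the real one."""
    return a * np.exp(-b * x)


rng = np.random.default_rng(0)
x = np.linspace(0.0, 4.0, 60)

# Assumed noise levels: three blocks of 20 points with growing error,
# playing the role of data sets A, B and C.
sigma = np.repeat([0.05, 0.2, 0.5], 20)
y = model(x, 2.0, 0.7) + rng.normal(0.0, sigma)

# Ordinary least squares: every residual counts equally, so the noisy
# block contributes most of the sum of squares.
p_ols, cov_ols = curve_fit(model, x, y, p0=[1.0, 1.0])

# Weighted least squares: curve_fit minimizes sum(((y - f(x)) / sigma)**2).
# absolute_sigma=True treats sigma as real standard deviations, not just
# relative weights, when computing the parameter covariance.
p_wls, cov_wls = curve_fit(
    model, x, y, p0=[1.0, 1.0], sigma=sigma, absolute_sigma=True
)

print("OLS estimate:", p_ols, "+/-", np.sqrt(np.diag(cov_ols)))
print("WLS estimate:", p_wls, "+/-", np.sqrt(np.diag(cov_wls)))
```

Plotting both fitted curves over the data with error bars is the quickest way to judge whether the difference matters for a given problem.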
2) What does "rescaling" mean in this context?
You must change B and C so that all three data sets end up with the same error:
    Ay  +/- 5
    B'y +/- 5
    C'y +/- 5
Or maximize the likelihood of a multivariate normal distribution, whose covariance matrix describes your assumption about the heteroscedasticity of the residuals.
         | \sigma_A^2   0            0          |
\Sigma = | 0            \sigma_B^2   0          |
         | 0            0            \sigma_C^2 |
Heteroscedastic log-likelihood:
\ln L = -\frac{n}{2} \ln(2\pi) - \frac{1}{2} \sum_i \ln(\sigma_i^2) - \frac{1}{2} \sum_i \sigma_i^{-2} (y_{obs,i} - y_{sim,i})^2
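The expression above can be written out term by term. This is a minimal standard-library sketch; the function name and argument names are made up for illustration, and sigma is assumed to hold one known standard deviation per observation.

```python
import math


def heteroscedastic_loglike(y_obs, y_sim, sigma):
    """Gaussian log-likelihood with one standard deviation per point.

    y_obs, y_sim and sigma are equal-length sequences of floats.
    """
    n = len(y_obs)
    if len(y_sim) != n or len(sigma) != n:
        raise ValueError("y_obs, y_sim and sigma must have the same length")

    # -n/2 * ln(2*pi)
    const = -0.5 * n * math.log(2.0 * math.pi)
    # -1/2 * sum(ln(sigma_i^2))
    log_var = -0.5 * sum(math.log(s * s) for s in sigma)
    # -1/2 * sum((y_obs_i - y_sim_i)^2 / sigma_i^2)
    misfit = -0.5 * sum(
        ((yo - ys) / s) ** 2 for yo, ys, s in zip(y_obs, y_sim, sigma)
    )
    return const + log_var + misfit


# Example call with made-up numbers.
print(heteroscedastic_loglike([1.0, 2.0], [0.5, 2.5], [0.5, 2.0]))
```

When the sigma values are known and fixed, the first two terms do not depend on the model parameters, so maximizing this quantity is the same as minimizing the sum of squared residuals divided by their variances, that is, weighted least squares. The extra terms only matter if the sigma values are themselves being estimated.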
You might also consider the possibility that your errors are multiplicative rather than additive. In this case, describing the residuals by a lognormal distribution could make more sense.
Maximize lognormal likelihood: L=lognormal(y_sim | ln(y_obs), \sigma)
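A minimal sketch of that lognormal log-likelihood, using only the standard library. The function names are made up, a single shared \sigma is assumed, and the argument order follows the expression as written in this message (density evaluated at y_sim, with location ln(y_obs)). The more common convention treats y_obs as the random variable and would swap the two; the squared log-residual is the same either way, and only the -ln(x) term differs.

```python
import math


def lognormal_logpdf(x, mu, sigma):
    """Log-density of a lognormal: ln(x) is normal with mean mu and std sigma."""
    if x <= 0.0 or sigma <= 0.0:
        raise ValueError("x and sigma must be positive")
    z = (math.log(x) - mu) / sigma
    return -math.log(x * sigma * math.sqrt(2.0 * math.pi)) - 0.5 * z * z


def lognormal_loglike(y_obs, y_sim, sigma):
    """Sum of lognormal log-densities, in the order written in the message."""
    return sum(
        lognormal_logpdf(ys, math.log(yo), sigma)
        for yo, ys in zip(y_obs, y_sim)
    )


# Example call with made-up, strictly positive values.
print(lognormal_loglike([1.0, 2.0, 4.0], [1.1, 1.8, 4.4], 0.2))
```

Because the residual enters as ln(y_sim) - ln(y_obs), a 10% miss on a small value costs as much as a 10% miss on a large one, which is the point of using this form for multiplicative errors.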
Cheers,
David
Matthieu
_______________________________________________ SciPy-user mailing list SciPy-user@scipy.org http://projects.scipy.org/mailman/listinfo/scipy-user
Participants (3): David Huard, John Hassler, Trevis Crane