[SciPy-User] How to fit data with errorbars

Tue Feb 16 22:33:34 EST 2010

On Tue, Feb 16, 2010 at 10:24 PM, Nathaniel Smith <njs at pobox.com> wrote:
> On Tue, Feb 16, 2010 at 10:08 AM, Jeremy Conlin <jlconlin at gmail.com> wrote:
>> I have some data with some errobars that I need to fit to a line.  Is
>> there anyway to use scipy.polyfit with the error associated with the
>> data?  If not, how can I make a fit routine work with my data?
>
> If the error is just in the y (dependent) variable, then this may be
> quite simple. If your y values:
>  -- have a normally distributed error
>  -- whose standard deviation varies from point to point
>  -- but the standard deviations are known (up to a multiplicative constant)
> then it's a classic problem -- perhaps google "heterskedasticity
> correction" or "weighted least squares" or "generalized least
> squares". But basically, just make a model matrix X that has one row
> for each observation, and one column for each predictor. If you want
> to do, say, third-order polynomial fitting and you have two vectors of
> independent variables x1 and x2 and you want to include an intercept,
> then it's something like (all untested):
>
> X = np.column_stack([np.ones(len(x1)), x1, x1 ** 2, x1 ** 3, x2, x2 **
> 2, x2 ** 3])
>
> Now a normal least squares fit, ignoring the errorbars, would be:
>
> beta = np.linalg.lstsq(X, y)[0]
>
> with predicted values np.dot(X, beta).
>
> If you have the standard deviation for each measurement in a vector
> stddevs, then the proper heteroskedasticity-corrected fit is just:
>
> beta = np.linalg.lstsq((1 / stddevs) * X, (1 / stddevs) * y)[0]
>
> -- Nathaniel
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user
>