[SciPy-user] nonlinear fit with non uniform error?

Anne Archibald peridot.faceted at gmail.com
Thu Jun 21 13:03:36 EDT 2007


On 21/06/07, massimo sandal <massimo.sandal at unibo.it> wrote:
> Sorry, but I am quite a noob in serious data analysis (degree in
> molecular biology, sigh)...
>
> > The easiest solution is to rescale your y values by the uncertainties
> > before doing the fit.
>
> What do you mean by that?

If you're doing a linear least-squares fit, you're producing a matrix
M and searching for a set of parameters P so that M*P is as close to
your measured values Y as possible, in a least-squares sense. (If
you're fitting a straight line, for example, P is the two-element
vector [m,b], and M is the matrix whose first column is the x values
and whose second column is all ones.)

In order to adjust the relative importance of the different data
points, you can divide a row of M and the corresponding row of Y by
the same constant. This will have the effect of changing the relative
importance of this row compared to the others. In numpy terminology,
if U is the vector of uncertainties on the Y values, you want to
replace
P = lstsq(A,Y)
with
P = lstsq((1/U)[:,newaxis]*A,(1/U)*Y)

Effectively, this rescales all your measurements so that they have the
same uncertainty. Think of it as writing all your measurements in
units of one sigma.

If you're doing nonlinear least squares, a similar trick is still
possible, though of course it is no longer matrix multiplication.

> > Now, if your errors are not Gaussian, least-squares is no longer the
> > correct approach and your life becomes more difficult...
>
> In which sense not Gaussian? In the sense that for each point, the
> uncertainity is not Gaussian distributed? It should at least with good
> approximation be. If it is in another sense, please explain...

No, that's what I meant. There's no need to get into all this in your
case, it seems. But in astronomical data, it is very common for the
distribution of values to not be Gaussian at all - Poisson errors are
probably the most common, though chi-squared and other more exotic
distributions crop up too. In these cases, least squares fitting
simply gives the wrong answer (though sometimes it's not too wrong, to
astronomical accuracy).

Anne



More information about the SciPy-User mailing list