[SciPy-User] a routine for fitting a straight line to data

Jerome Kieffer Jerome.Kieffer at esrf.fr
Mon Mar 25 02:28:52 EDT 2013


On Sun, 24 Mar 2013 21:27:09 -0400
David Pine <djpine at gmail.com> wrote:

> scipy.stats.linregress does not do the job.  First, it does not allow for weighting of the data, relative, absolute, or otherwise.  It does not return the covariance matrix, which provides estimates of the uncertainties in the fitting parameters, nor does it return chi-squared, which is the standard measure of the quality of the fit in the physical sciences.

I implemented a linear regression according to what I found in stats textbooks (and various pages of wikipedia) ... 

(arrays are 2D, considered as m datasets of length n

Sx = (w * x).sum(axis= -1)
Sy = (w * y).sum(axis= -1)
Sxx = (w * x * x).sum(axis= -1)
Sxy = (w * y * x).sum(axis= -1)
Syy = (w * y * y).sum(axis= -1)
Sw = w.sum(axis= -1)
slope = (Sw * Sxy - Sx * Sy) / (Sw * Sxx - Sx * Sx)
intercept = (Sy - Sx * slope) / Sw
df = n - 2
r_num = ssxym = (Sw * Sxy) - (Sx * Sy)
ssxm = Sw * Sxx - Sx * Sx
ssym = Sw * Syy - Sy * Sy
r_den = numpy.sqrt(ssxm * ssym)
correlationR = r_num / r_den
correlationR[r_den == 0] = 0.0
correlationR[correlationR > 1.0] = 1.0  # Numerical errors
correlationR[correlationR < -1.0] = -1.0  # Numerical errors
sterrest = numpy.sqrt((1.0 - correlationR * correlationR) * ssym / ssxm / df)

Hope this helps

-- 
Jérôme Kieffer
Data analysis unit - ESRF



More information about the SciPy-User mailing list