As of NumPy v1.7, numpy.polyfit includes an option for providing weighting to data to be fit. It's a welcome addition, but the implementation seems a bit non-standard, perhaps even wrong, and I wonder if someone can enlighten me. 1. The documentation does not specify what the weighting array "w" is supposed to be. After some fooling around I figured out that it is 1/sigma, where sigma is the standard deviation uncertainty "sigma" in the y data. A better implementation, which would be consistent with how weighting is done in scipy.optimize.curve_fit, would be to specify "sigma" directly, rather than "w". This is typically what the user has at hand, this syntax is more transparent, and it would make polyfit consistent with the nonlinear fitting routine scipy.optimize.curve_fit. 2. The definition of the covariance matrix in polyfit is peculiar (or maybe wrong, I'm not sure). Towards the end of the code for polyfit, the standard covariance matrix is calculated and given the variable name "Vbase". However, instead of returning Vbase as the covariance matrix, polyfit returns the quantity Vbase * fac, where fac = resids / (len(x) - order - 2.0). fac scales as N, the number of data points, so the covariance matrix is about N times larger than it should be for large values of N; the increase is smaller for small N (of order 10). Perhaps the authors of polyfit want to use a more conservative error estimate, but the references given in the poyfit code make no mention of it. I think it would be much better to return the standard definition of the covariance matrix (see, for example, the book Numerical Recipes). David Pine
participants (4)
-
Charles R Harris
-
David Pine
-
josef.pktd@gmail.com
-
Pauli Virtanen