[SciPy-User] adding linear fitting routine
David J Pine
djpine at gmail.com
Wed Dec 4 17:00:31 EST 2013
Ok, here are my thoughts about how to do the returns. They are informed by
(1) the speed of linfit and (2) the above discussion.
(1) Speed. linfit runs fastest when the residuals are not calculated.
Calculating the residuals generally slows down linfit by a factor of 2 to
3 -- and it's the only thing that really slows it down. After that, all
additional calculations consume negligible time. The residuals are
calculated if (a) residuals=True or chisq=True, or (b) cov=True AND
relsigma=True. Note that (b) means that the residuals are not calculated
when cov=True AND relsigma=False (provided residuals=False and chisq=False).
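The rule in (1) reduces to a single boolean test. Here is a minimal sketch, assuming the keyword names used in this thread (residuals, chisq, cov, relsigma); the helper name needs_residuals is hypothetical and is not part of linfit:

```python
def needs_residuals(residuals=False, chisq=False, cov=False, relsigma=True):
    """Return True when the fit has to compute the residuals array.

    Hypothetical helper: it only restates the rule described above.
    """
    return bool(residuals or chisq or (cov and relsigma))
```

With this test, cov=True together with relsigma=False stays on the fast path as long as neither residuals nor chisq is requested.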
(2) The consensus of the discussion seems to be that when a lot of things
are returned by linfit, it's better to return everything as a dictionary.
So here is what I propose:
If the user only wants the optimal fitting parameters, or the user wants
only the optimal fitting parameters and the covariance matrix, these can be
returned as arrays.
Otherwise, everything is returned, and returned as a dictionary.
If we adopted this, then the only question is what the default setting
would be, say return_all=False or return_all=True. I guess I would opt for
return_all=False, the less verbose return option.
Adopting this way of doing things would simplify the arguments of linfit,
which would now look like
linfit(x, y, sigmay=None, relsigma=True, return_all=False)
I would also modify linfit to calculate the r-value, p-value, and the
stderr, which would all be returned in dictionary format when
return_all=True.
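To make the proposed return convention concrete, here is a rough sketch, not the actual linfit code. The name linfit_sketch, the dictionary keys, and the choice to return (fit, cov) when return_all=False are all assumptions made for illustration; the r-value, p-value, and stderr are left out to keep it short.

```python
import numpy as np


def linfit_sketch(x, y, sigmay=None, relsigma=True, return_all=False):
    """Weighted straight-line fit y = slope*x + intercept (illustrative only)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Weights are 1/sigma^2; with no uncertainties every point gets weight 1.
    if sigmay is None:
        w = np.ones_like(x)
    else:
        w = 1.0 / (np.asarray(sigmay, dtype=float) * np.ones_like(x)) ** 2

    wsum = w.sum()
    xbar = (w * x).sum() / wsum
    ybar = (w * y).sum() / wsum
    dx = x - xbar
    sxx = (w * dx * dx).sum()

    slope = (w * dx * (y - ybar)).sum() / sxx
    intercept = ybar - slope * xbar
    fit = np.array([slope, intercept])

    # Covariance of (slope, intercept) when sigmay holds absolute uncertainties.
    cov = np.array([[1.0 / sxx, -xbar / sxx],
                    [-xbar / sxx, 1.0 / wsum + xbar**2 / sxx]])

    # The residuals are the expensive step, so compute them only when needed.
    residuals = chisq = None
    if relsigma or return_all:
        residuals = y - (slope * x + intercept)
        chisq = (w * residuals**2).sum()
    if relsigma:
        # Relative weights: rescale by the reduced chi-square (needs > 2 points).
        cov = cov * chisq / (x.size - 2)

    # Assumed convention: arrays for the short form, a dict for everything.
    if not return_all:
        return fit, cov
    return {"fit": fit, "cov": cov, "residuals": residuals, "chisq": chisq}
```

Called as fit, cov = linfit_sketch(x, y) it behaves like the short form; with return_all=True the same call hands back one dictionary, so new statistics could be added later without changing the call signature.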
How does this sound?
David
On Wed, Dec 4, 2013 at 9:15 PM, Matt Newville <newville at cars.uchicago.edu> wrote:
> Hi David,
>
> On Wed, Dec 4, 2013 at 1:13 PM, David J Pine <djpine at gmail.com> wrote:
> > I guess my preference would be to have linfit() be as similar to
> > curve_fit() in outputs (and inputs in so far as it makes sense), and then
> > if we decide we prefer another way of doing either the inputs or the
> > outputs, then to do them in concert. I think there is real value in making
> > the user interfaces of linfit() and curve_fit() consistent--it makes the
> > user's experience so much less confusing. As of right now, I am agnostic
> > about whether or not the function returns a dictionary of results--although
> > I am unsure of what you have in mind. How would you structure a dictionary
> > of results?
> >
>
> Using return (pbest, covar) seems reasonable. But, if you
> returned a dictionary, you could include a chi-square statistic and a
> residuals array.
>
> scipy.optimize.leastsq() returns 5 items when called with full_output=True:
> (pbest, covar, infodict, mesg, ier)
> with infodict being a dict with items 'nfev', 'fvec', 'fjac', 'ipvt',
> and 'qtf'. I think it's too late to change it, but it would have
> been nicer (IMHO) if it had returned a single dict instead:
>
> return {'best_values': pbest, 'covar': covar, 'nfev':
> infodict['nfev'], 'fvec': infodict['fvec'],
> 'fjac': infodict['fjac'], 'ipvt': infodict['ipvt'],
> 'qtf': infodict['qtf'], 'mesg': mesg, 'ier': ier}
>
> Similarly, linregress() returns a 5 element tuple. The problem with
> these is that you end up with long assignments
>     slope, intercept, r_value, p_value, stderr = scipy.stats.linregress(xdata, ydata)
>
> in fact, you sort of have to do this, even for a quick and dirty
> result when slope and intercept are all that would be used later on.
> The central problem is these 5 returned values are now in your local
> namespace, but they are not really independent values. Instead, you
> could think about
>     regression = scipy.stats.linregress(xdata, ydata)
>
> and get to any of the values from computing the regression you want.
> In short, if you had linfit() return a dictionary of values, you could put
> many statistics in it, and people who wanted to ignore some of them would
> be able to do so.
>
> FWIW, a named tuple would be a fine alternative. I don't know if
> backward compatibility would prevent that in scipy. Anyway, it's
> just a suggestion....
>
> --Matt
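
The named-tuple alternative mentioned above can be sketched with the standard library alone. LinFitResult and fit_line are hypothetical names used only for illustration, and the fit is a plain unweighted least-squares line rather than anything taken from linfit or linregress:

```python
from collections import namedtuple

# Hypothetical result type: fields can be read by name or unpacked by position.
LinFitResult = namedtuple("LinFitResult",
                          ["slope", "intercept", "residuals", "chisq"])


def fit_line(xdata, ydata):
    """Unweighted least-squares line through (xdata, ydata)."""
    n = len(xdata)
    xbar = sum(xdata) / n
    ybar = sum(ydata) / n
    sxx = sum((x - xbar) ** 2 for x in xdata)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xdata, ydata))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    residuals = [y - (slope * x + intercept) for x, y in zip(xdata, ydata)]
    chisq = sum(r * r for r in residuals)
    return LinFitResult(slope, intercept, residuals, chisq)


# One name in the local namespace; attributes are looked up only when needed.
regression = fit_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
slope, intercept = regression[:2]  # positional unpacking still works
```

A named tuple keeps tuple-style unpacking working for existing callers while letting new code write regression.slope, which is the main reason it might sit more easily with backward compatibility than a plain dictionary.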