[SciPy-User] ancova with optimize.curve_fit

Mon Dec 6 19:31:13 EST 2010

 thanks both of you,

Josef, the data that I sent is only the first 100 rows of about 1500, there should be sufficient sampling in each plot.

Skipper, I have attempted to deploy your suggestion for not linearizing the data. It seems to work. I'm a little confused at your modification if the getDiam function and I wonder if you could help me understand. The form of the equation that is being fit is:

Y= a*X^b

your version of the detDaim function:

> 
> > 
> > def getDiam(ht, *b):
> > return ht[:,0]**b[0] + np.sum(b[1:]*ht[:,1:], axis=1)@gmail.com>@gmail.com>
> > 
> 
> 
> 
> 
Im sorry if this is an obvious question but I don't understand how this works as it seems that the "a" coefficient is missing.

Thanks again! 

-- 
Peter Tittmann

On Thursday, December 2, 2010 at 5:03 PM, josef.pktd at gmail.com wrote:

> On Thu, Dec 2, 2010 at 7:57 PM, Skipper Seabold <jsseabold at gmail.com> wrote:
> 
> >  On Thu, Dec 2, 2010 at 7:11 PM, Skipper Seabold  wrote:
> > > On Thu, Dec 2, 2010 at 5:59 PM, Peter Tittmann  wrote:
> > >> getDiam is a predictor to get dbh from height. It works with curve_fit to
> > >> find coefficients a and b given datasetset of known dbh/height pairs. You
> > >> are right, what I want is dummy variables for each plot. I'll see if I can
> > >> get that worked out by revising getDiam..
> > >> Thanks again
> > >>
> > >
> > > I think it would be easier to create your dummy variables before you pass it in.
> > >
> > > You might find some of the tools in statsmodels to be helpful here.
> > > We don't yet have an ANCOVA model, but you could definitely do
> > > something like the following. Not sure if it's exactly what you want,
> > > but it should give you an idea.
> > >
> > > import numpy as np
> > > import scikits.statsmodels as sm
> > >
> > > dta = np.genfromtxt('./db_out.csv', delimiter=",", names=True, dtype=None)
> > > plot_dummies, col_map = sm.tools.categorical(dta['plot'], drop=True,
> > > dictnames=True)
> > >
> > > plot_dummies will be dummy variables for all of the "plot" categories,
> > > and col_map is a map from the column number to the plot just so you
> > > can be sure you know what's what.
> > >
> > > I don't see how to use your objective function though with dummy
> > > variables. What happens if the effect of one of the plots is
> > > negative, then you run into 0 ** -1.5 == inf.
> > >
> > 
> >  If you want to do NLLS and not linearize then something like this
> >  might work and still keep the dummy variables as shift parameters
> > 
> >  def getDiam(ht, *b):
> > return ht[:,0]**b[0] + np.sum(b[1:]*ht[:,1:], axis=1)
> > 
> >  X = np.column_stack((indHtPlot, plot_dummies))
> >  Y = depDbh
> >  coefs, cov = optimize.curve_fit(getDiam, X, Y, p0= [0.]*X.shape[1])
> > @gmail.com>@gmail.com>
> > 
> 
> In the sample file there are 11 levels of the `plot` that have only a
> single observation each. I tried to use onewaygls, but statsmodels.OLS
> doesn't work if y is a scalar.
> 
> I don't know whether curvefit or optimize.leastsq will converge in
> this case, good starting values might be necessary.
> 
> Josef
> 
> 
> 
> > 
> > 
> > > You could linearize your objective function to be
> > >
> > > b*ln(ht)
> > >
> > > and do something like
> > >
> > > indHtPlot = dta['height']
> > > depDbh = dta['dbh']
> > > X = np.column_stack((np.log(indHtPlot), plot_dummies))
> > > Y = np.log(depDbh)
> > > res = sm.OLS(Y,X).fit()
> > > res.params
> > > array([ 0.98933264, -1.35239293, -1.0623305 , -0.99155293, -1.33675099,
> > >  -1.30657011, -1.50933751, -1.28744779, -1.43937358, -1.33805883,
> > >  -1.32744257, -1.42672539, -1.35239293, -1.60585046, -1.45239093,
> > >  -1.45695112, -1.34811186, -1.32658794, -1.21721715, -1.32853084,
> > >  -1.45775017, -1.44460388, -2.19065236, -1.3303631 , -1.20509831,
> > >  -1.37341535, -1.25746105, -1.33954972, -1.33922709, -1.247304 ])
> > >
> > > Note that your coefficient on height is now an elasticity. I'm sure
> > > I'm missing something here, but that might help you along the way.
> > >
> > > Skipper
> > >
> >  _______________________________________________
> >  SciPy-User mailing list
> > SciPy-User at scipy.org
> > http://mail.scipy.org/mailman/listinfo/scipy-user
> > 
> > 
> > 
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user
> 
> 
> 
> 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.scipy.org/pipermail/scipy-user/attachments/20101206/83960350/attachment.html>