[SciPy-User] ancova with optimize.curve_fit

Mon Dec 6 19:41:09 EST 2010

On Mon, Dec 6, 2010 at 7:31 PM, Peter Tittmann <ptittmann at gmail.com> wrote:
> thanks both of you,
> Josef, the data that I sent is only the first 100 rows of about 1500, there
> should be sufficient sampling in each plot.
> Skipper, I have attempted to deploy your suggestion for not linearizing the
> data. It seems to work. I'm a little confused by your modification of the
> getDiam function and I wonder if you could help me understand. The form of
> the equation that is being fit is:
> Y = a*X^b
> your version of the getDiam function:
>
> def getDiam(ht, *b):
>    return ht[:,0]**b[0] + np.sum(b[1:]*ht[:,1:], axis=1)
>
> I'm sorry if this is an obvious question but I don't understand how this
> works, as it seems that the "a" coefficient is missing.
> Thanks again!

Right.  I took out the 'a' because, as I read it when I linearized (I
might be misunderstanding ancova, I never recall the details), if you
include 'a' and also all of the dummy variables for the plot, then you
will have the problem of multicollinearity: the dummies sum to one, so
together they already act as a constant.  You could also include
'a' and drop one of the plot dummies, but then 'a' is just the effect of
the reference category that you dropped.  So now b[0] is the nonlinear
effect of your main variable and b[1:] contains the linear shift effects
of all the plots.  Hmm, thinking about it some more, though, I think
you could include 'a' in the non-linear version above (call it b[0]
and shift everything else over by one), because now 'a' would be the
effect when the current b[0] is zero.  I was just unsure how you meant
'a' when you had a*ht**b and were trying to include the plot
dummy variables in ht.
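
To make the two cases concrete, here is a minimal sketch on made-up data.
Everything in it is an assumption for illustration: the number of plots,
the "true" parameter values, the noise level, and the name
get_diam_with_a are hypothetical and not taken from the real data set.
It first checks the rank of a linear design matrix with a constant plus
all plot dummies, then fits a*ht**b plus a shift for every plot with
curve_fit.

```python
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(0)

# Hypothetical layout: 3 plots, 100 trees each (not the real data).
n_plots, n_per_plot = 3, 100
plot = np.repeat(np.arange(n_plots), n_per_plot)
ht = rng.uniform(2.0, 30.0, size=plot.size)

# One 0/1 dummy column per plot.
dummies = (plot[:, None] == np.arange(n_plots)).astype(float)

# Linear case: a constant plus ALL dummies is rank deficient,
# because the dummy columns sum to the constant column.
X_full = np.column_stack([np.ones(plot.size), dummies])
rank_full = np.linalg.matrix_rank(X_full)

# Dropping one dummy (the reference plot) gives full column rank.
X_drop = np.column_stack([np.ones(plot.size), dummies[:, 1:]])
rank_drop = np.linalg.matrix_rank(X_drop)


def get_diam_with_a(x, a, b, *shifts):
    """a * ht**b plus a linear shift for each plot.

    x[:, 0] is height, x[:, 1:] are the plot dummies.
    """
    return a * x[:, 0] ** b + x[:, 1:] @ np.asarray(shifts)


# Assumed "true" values, used only to simulate data.
true_params = np.array([2.0, 0.6, 1.0, -1.5, 0.5])
X = np.column_stack([ht, dummies])
diam = get_diam_with_a(X, *true_params) + rng.normal(0.0, 0.05, plot.size)

# p0 is required here: with *shifts in the signature, curve_fit
# cannot infer the number of parameters on its own.
p0 = np.array([1.0, 1.0, 0.0, 0.0, 0.0])
popt, pcov = curve_fit(get_diam_with_a, X, diam, p0=p0)
```

Printing rank_full, rank_drop and popt shows the rank problem in the
linear design and the estimates for a, b and the plot shifts.  Note that
the non-linear fit only separates 'a' from the shifts while b is away
from zero; as b approaches zero, ht**b becomes a constant and the same
collinearity comes back.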

Skipper