Hello,

I'm reading the Zuur book (ecology models with R) and trying to port all of it to Python. I have this call in R:

    M4 <- gam(So ~ s(De) + factor(ID), subset = I1)

The 's' term indicates that So is modelled as a smoothing function of De. I'm looking for something close to this in Python. Can someone help me?

Regards, Koblitz
On Thu, May 15, 2014 at 8:04 AM, rodrigo koblitz <rodrigokoblitz@gmail.com> wrote:
Hello, I'm reading the Zuur book (ecology models with R) and trying to port all of it to Python. I have this call in R: M4 <- gam(So ~ s(De) + factor(ID), subset = I1)
The 's' term indicates that So is modelled as a smoothing function of De.
I'm looking for something close to this in Python.
This kind of general question is better asked on the scipy-user mailing list, which covers more general topics than numpy-discussion.

As far as I know, GAMs are not available in Python; at least I never came across any.

statsmodels has an ancient GAM in the sandbox that has never been connected to any smoother, since lowess, spline and kernel regression support was missing. Nobody is working on that right now.

If you have only a single nonparametric variable, then statsmodels also has a partial linear model based on kernel regression. It is not cleaned up or verified, but Padarn is currently working on this.

I think in this case using a penalized linear model with spline basis functions would be more efficient, but there is also nothing clean available, AFAIK.

It's not too difficult to write the basic models, but it takes time to figure out the last 10%, to verify the results and to write unit tests.

If you make your code publicly available, then I would be very interested in a link. I'm trying to collect examples from books that have a Python solution.

Josef
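As a rough illustration of the "penalized linear model with spline basis functions" idea mentioned above, here is a minimal NumPy-only sketch. It is not statsmodels code and not the method from the Zuur book: the truncated power basis, the ridge-style penalty on the knot coefficients, the helper names (`truncated_power_basis`, `fit_penalized_spline`), the knot placement, the value of `lam` and the synthetic data are all assumptions made for the example.

```python
import numpy as np


def truncated_power_basis(x, knots, degree=3):
    """Design matrix with columns 1, x, ..., x^degree, followed by one
    truncated power term (x - k)_+^degree per knot."""
    x = np.asarray(x, dtype=float)
    knots = np.asarray(knots, dtype=float)
    poly = np.vander(x, degree + 1, increasing=True)
    trunc = np.clip(x[:, None] - knots[None, :], 0.0, None) ** degree
    return np.hstack([poly, trunc])


def fit_penalized_spline(x, y, knots, lam=1.0, degree=3):
    """Penalized least squares: a ridge penalty is applied to the knot
    coefficients only, the polynomial part is left unpenalized."""
    X = truncated_power_basis(x, knots, degree)
    penalty = np.zeros(X.shape[1])
    penalty[degree + 1:] = 1.0
    return np.linalg.solve(X.T @ X + lam * np.diag(penalty), X.T @ y)


# Synthetic data, purely for illustration (not from the book)
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0.0, 1.0, 200))
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=x.size)

knots = np.linspace(0.0, 1.0, 12)[1:-1]  # 10 interior knots
coef = fit_penalized_spline(x, y, knots, lam=0.1)
fitted = truncated_power_basis(x, knots) @ coef
```

Larger values of `lam` pull the fit towards a plain cubic polynomial, smaller values let the curve wiggle more. In a real GAM the smoothing parameter would be chosen from the data (for example by cross-validation), which this sketch does not attempt.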
_______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
On Thu, May 15, 2014 at 1:04 PM, rodrigo koblitz <rodrigokoblitz@gmail.com> wrote:
Hello, I'm reading the Zuur book (ecology models with R) and trying to port all of it to Python. I have this call in R: M4 <- gam(So ~ s(De) + factor(ID), subset = I1)
The 's' term indicates that So is modelled as a smoothing function of De.
I'm looking for something close to this in Python.
The closest thing that doesn't require writing your own code is probably to use patsy's [1] support for (simple unpenalized) spline basis transformations [2]. I think using statsmodels this works like:

    import statsmodels.formula.api as smf

    # my_df: a pandas DataFrame with columns So, De and ID
    # adjust '5' to taste -- bigger = wigglier, less bias, more overfitting
    results = smf.ols("So ~ bs(De, 5) + C(ID)", data=my_df).fit()
    print(results.summary())

To graph the resulting curve you'll want to use the results to somehow do "prediction" -- I'm not sure what the API for that looks like in statsmodels. If you need help figuring it out, then asking on the statsmodels list or stackoverflow is probably the quickest way to get help.

-n

[1] http://patsy.readthedocs.org/en/latest/
[2] http://patsy.readthedocs.org/en/latest/builtins-reference.html#patsy.builtin...

--
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org
On Thu, May 15, 2014 at 12:17 PM, Nathaniel Smith <njs@pobox.com> wrote:
The closest thing that doesn't require writing your own code is probably to use patsy's [1] support for (simple unpenalized) spline basis transformations [2]. I think using statsmodels this works like:
    import statsmodels.formula.api as smf

    # adjust '5' to taste -- bigger = wigglier, less bias, more overfitting
    results = smf.ols("So ~ bs(De, 5) + C(ID)", data=my_df).fit()
    print(results.summary())
Nice
To graph the resulting curve you'll want to use the results to somehow do "prediction" -- I'm not sure what the API for that looks like in statsmodels. If you need help figuring it out, then asking on the statsmodels list or stackoverflow is probably the quickest way to get help.
seems to work (in a very simple made-up example)

    import numpy as np

    results.predict({'De': np.arange(1, 5), 'ID': ['a'] * 4}, transform=True)
    #array([ 0.75 , 1.08333333, 0.75 , 0.41666667])

Josef
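Putting the pieces of this thread together, the following is a hedged end-to-end sketch of the fit and the prediction step. The data are synthetic stand-ins: only the column names So, De, ID and I1 come from the R call, while the values, the sample size and the grid of 50 points are made up. It also assumes that R's `subset = I1` can be treated as a boolean mask and applied by indexing the DataFrame before fitting.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic stand-in for the book's data (values are invented).
rng = np.random.default_rng(0)
n = 300
my_df = pd.DataFrame({
    "De": rng.uniform(0.0, 10.0, n),
    "ID": rng.choice(["a", "b", "c"], n),
    "I1": rng.random(n) < 0.8,  # boolean mask playing the role of subset = I1
})
offsets = my_df["ID"].map({"a": 0.0, "b": 1.0, "c": -1.0})
my_df["So"] = np.sin(my_df["De"]) + offsets + rng.normal(scale=0.3, size=n)

# subset = I1 becomes plain boolean indexing on the DataFrame.
sub = my_df[my_df["I1"]]
results = smf.ols("So ~ bs(De, 5) + C(ID)", data=sub).fit()

# Predict on a grid of De values for one level of ID. The grid stays inside
# the observed range of De, because the spline basis was built on that range.
grid = pd.DataFrame({
    "De": np.linspace(sub["De"].min(), sub["De"].max(), 50),
    "ID": "a",
})
curve = results.predict(grid)
print(curve.head())
```

Plotting `grid["De"]` against `curve` then gives the estimated smooth for ID "a". Note that this is an unpenalized regression spline, so unlike `gam` in R the amount of smoothing is fixed by the chosen degrees of freedom rather than estimated.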