Mailman 3 smoothing function - NumPy-Discussion

smoothing function


rodrigo koblitz

15 May 2014

12:04 p.m.

Hello, I'm reading Zuur's book (ecology models with R) and trying to redo it entirely in Python. It has this function in R:

    M4 <- gam(So ~ s(De) + factor(ID), subset = I1)

The 's' term indicates that So is modelled as a smoothing function of De.

I'm looking for something close to this in Python. Can someone help me?

Regards,
Koblitz


josef.pktd＠gmail.com

15 May 2014

3:54 p.m.

On Thu, May 15, 2014 at 8:04 AM, rodrigo koblitz <rodrigokoblitz@gmail.com> wrote:

> Hello, I'm reading Zuur's book (ecology models with R) and trying to redo it entirely in Python. It has this function in R:
>
>     M4 <- gam(So ~ s(De) + factor(ID), subset = I1)
>
> The 's' term indicates that So is modelled as a smoothing function of De.
>
> I'm looking for something close to this in Python.

These kinds of general questions are better asked on the scipy-user mailing list, which covers more general topics than numpy-discussion.

As far as I know, GAMs are not available in Python; at least I have never come across any.

statsmodels has an ancient GAM in the sandbox that has never been connected to any smoother, since lowess, spline, and kernel regression support was missing. Nobody is working on that right now.

If you have only a single nonparametric variable, then statsmodels also has a partial linear model based on kernel regression. It is not cleaned up or verified, but Padarn is currently working on this.

I think in this case using a penalized linear model with spline basis functions would be more efficient, but there is also nothing clean available, AFAIK.

It's not too difficult to write the basic models, but it takes time to figure out the last 10%, verify the results, and write unit tests.

If you make your code publicly available, then I would be very interested in a link. I'm trying to collect examples from books that have a Python solution.

Josef
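For readers wondering what "a penalized linear model with spline basis functions" could look like in practice, here is a minimal NumPy-only sketch. It is an assumption-laden illustration, not code from statsmodels or from the book: it uses a truncated power basis with a ridge penalty on the knot coefficients, and the function names, the knot placement, the value of `lam`, and the synthetic data are all hypothetical.

```python
import numpy as np


def truncated_power_basis(x, knots, degree=3):
    """Columns 1, x, ..., x**degree, then (x - k)_+**degree for each knot k."""
    x = np.asarray(x, dtype=float)
    knots = np.asarray(knots, dtype=float)
    poly = np.vander(x, degree + 1, increasing=True)
    trunc = np.clip(x[:, None] - knots[None, :], 0.0, None) ** degree
    return np.hstack([poly, trunc])


def fit_penalized_spline(x, y, knots, lam, degree=3):
    """Least squares with a ridge penalty on the knot coefficients only."""
    basis = truncated_power_basis(x, knots, degree)
    n_poly = degree + 1
    n_knots = len(knots)
    # Append sqrt(lam) * I rows so that ordinary least squares on the
    # augmented system minimizes ||y - B c||^2 + lam * ||knot coefs||^2.
    penalty_rows = np.zeros((n_knots, basis.shape[1]))
    penalty_rows[:, n_poly:] = np.sqrt(lam) * np.eye(n_knots)
    augmented_X = np.vstack([basis, penalty_rows])
    augmented_y = np.concatenate([np.asarray(y, dtype=float), np.zeros(n_knots)])
    coef, *_ = np.linalg.lstsq(augmented_X, augmented_y, rcond=None)
    return coef


# Synthetic stand-in data (hypothetical; not the dataset from the book).
rng = np.random.default_rng(0)
De = np.linspace(0.0, 1.0, 200)
truth = np.sin(2 * np.pi * De)
So = truth + rng.normal(scale=0.2, size=De.size)

knots = np.linspace(0.0, 1.0, 12)[1:-1]  # 10 interior knots
coef = fit_penalized_spline(De, So, knots, lam=1e-4)
fitted = truncated_power_basis(De, knots) @ coef
```

A larger `lam` shrinks the knot coefficients toward zero, so the fit approaches a plain cubic polynomial; a smaller `lam` lets the curve follow the data more closely. Dummy columns for a factor such as ID could be appended to the basis without a penalty, but choosing `lam` (for example by cross-validation) is part of the "last 10%" mentioned above and is not handled here.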

> Can someone help me?
>
> Regards, Koblitz

_______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion

Nathaniel Smith

4:17 p.m.

On Thu, May 15, 2014 at 1:04 PM, rodrigo koblitz <rodrigokoblitz@gmail.com> wrote:

> Hello, I'm reading Zuur's book (ecology models with R) and trying to redo it entirely in Python. It has this function in R:
>
>     M4 <- gam(So ~ s(De) + factor(ID), subset = I1)
>
> The 's' term indicates that So is modelled as a smoothing function of De.
>
> I'm looking for something close to this in Python.

The closest thing that doesn't require writing your own code is probably to use patsy's [1] support for (simple unpenalized) spline basis transformations [2]. I think using statsmodels this works like:

    import statsmodels.formula.api as smf
    # adjust '5' to taste -- bigger = wigglier, less bias, more overfitting
    results = smf.ols("So ~ bs(De, 5) + C(ID)", data=my_df).fit()
    print(results.summary())

To graph the resulting curve you'll want to use the results to somehow do "prediction" -- I'm not sure what the API for that looks like in statsmodels. If you need help figuring it out, then asking on the statsmodels list or Stack Overflow is probably the quickest way to get help.

-n

[1] http://patsy.readthedocs.org/en/latest/
[2] http://patsy.readthedocs.org/en/latest/builtins-reference.html#patsy.builtin...

--
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org
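To make the "adjust '5' to taste" comment concrete, here is a small sketch of what patsy's `bs()` produces for different degrees of freedom. The predictor values are made up and the variable names are only for illustration; the second positional argument of `bs()` is its `df` parameter.

```python
import numpy as np
from patsy import dmatrix

# Made-up predictor values standing in for De (not from the book).
De = np.linspace(0.0, 10.0, 100)

basis_columns = {}
for df in (3, 5, 10):
    # "- 1" drops patsy's intercept column so only the spline basis remains.
    design = dmatrix(f"bs(De, {df}) - 1", {"De": De})
    basis_columns[df] = np.asarray(design).shape[1]

print(basis_columns)
```

Each extra degree of freedom adds one basis column, and therefore one more coefficient for the linear model to estimate, which is why a bigger value gives a wigglier curve.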

josef.pktd＠gmail.com

4:47 p.m.

On Thu, May 15, 2014 at 12:17 PM, Nathaniel Smith <njs@pobox.com> wrote:

> The closest thing that doesn't require writing your own code is probably to use patsy's [1] support for (simple unpenalized) spline basis transformations [2]. I think using statsmodels this works like:
>
>     import statsmodels.formula.api as smf
>     # adjust '5' to taste -- bigger = wigglier, less bias, more overfitting
>     results = smf.ols("So ~ bs(De, 5) + C(ID)", data=my_df).fit()
>     print(results.summary())

Nice

> To graph the resulting curve you'll want to use the results to somehow do "prediction" -- I'm not sure what the API for that looks like in statsmodels. If you need help figuring it out, then asking on the statsmodels list or Stack Overflow is probably the quickest way to get help.

Seems to work (in a very simple made-up example):

    results.predict({'De': np.arange(1, 5), 'ID': ['a'] * 4}, transform=True)
    # array([ 0.75 , 1.08333333, 0.75 , 0.41666667])

Josef
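Combining the formula fit quoted above with this predict call, a self-contained sketch of computing the fitted curve on a grid might look like the following. The data frame is synthetic (the thread does not include the book's data), so the column values, the three ID levels, and the grid size are all assumptions. The grid stays inside the observed range of De because patsy's `bs()` does not extrapolate beyond the outermost knots.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic stand-in for my_df (hypothetical values, not the book's data).
rng = np.random.default_rng(0)
n = 200
my_df = pd.DataFrame({
    "De": rng.uniform(0.0, 10.0, size=n),
    "ID": rng.choice(["a", "b", "c"], size=n),
})
offsets = my_df["ID"].map({"a": 0.0, "b": 1.0, "c": -1.0})
my_df["So"] = np.sin(my_df["De"]) + offsets + rng.normal(scale=0.3, size=n)

# Unpenalized spline in De plus a categorical effect for ID.
results = smf.ols("So ~ bs(De, 5) + C(ID)", data=my_df).fit()

# Evaluate the fitted curve for one ID level on a grid within the data range.
grid = pd.DataFrame({
    "De": np.linspace(my_df["De"].min(), my_df["De"].max(), 50),
    "ID": "a",
})
curve = results.predict(grid)
```

Plotting `grid["De"]` against `curve` then gives the smooth for ID 'a'; the curves for the other levels differ only by a constant shift, since the model has no interaction between De and ID.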

