Any SPLUS to scipy ideas for lm and summary(lm)?
Hi Scipy-ers,

I would like to duplicate the following piece of SPLUS/R code in Python-scipy, and would love somebody smarter than me to give me some ideas. (If you don't know SPLUS/R, you may not want to bother with this.)

    model.kt <- summary(lm(kt.diff ~ 1))
    kt.drift <- model.kt$coefficients[1,1]  # Coefficient
    sec <- model.kt$coefficients[1,2]       # Standard Error of the Coefficient (SEC)
    see <- model.kt$sigma                   # Standard error of the Equation (SEE)

Getting a least-squares fit in scipy is not a problem, but getting all that other nice stuff IS kind of a problem. I don't mind either hacking scipy.stats, or writing my own function, but maybe someone has some ideas for this, maybe it can be contributed, or ???. I also realize that the SPLUS formula notation doesn't exist at all in scipy-Python, so no need to point that out to me.

Perhaps there should be a scipy.stats working group? It seems like scipy.stats (not including the probability distributions and basic summary functions, which are fine) is kind of a forgotten stepchild in scipy, and probably needs a nurturing aunt or uncle or several....

Thx, sorry for such an open ended question.

W
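For the intercept-only model in the R snippet above (kt.diff ~ 1), the three quantities have closed forms: the coefficient is the sample mean, the standard error of the equation is the residual standard deviation on n - 1 degrees of freedom, and the standard error of the coefficient is that value divided by sqrt(n). A minimal sketch with plain NumPy follows; the function name intercept_only_summary and the sample data are made up for illustration and are not part of scipy.

```python
import numpy as np


def intercept_only_summary(y):
    """Mimic summary(lm(y ~ 1)) for the three values used in the R snippet.

    Hypothetical helper, not a scipy function.
    Returns (drift, sec, see).
    """
    y = np.asarray(y, dtype=float)
    n = y.size
    drift = y.mean()                         # least-squares estimate of the intercept
    resid = y - drift                        # residuals of the fitted model
    see = np.sqrt(resid @ resid / (n - 1))   # residual standard error, n - 1 df
    sec = see / np.sqrt(n)                   # standard error of the intercept
    return drift, sec, see


# Made-up first differences, standing in for kt.diff.
kt_diff = np.array([-1.2, 0.4, -0.9, -0.3, -1.0])
kt_drift, sec, see = intercept_only_summary(kt_diff)
print(kt_drift, sec, see)
```

This only covers the single-coefficient case; a model with predictors needs the design-matrix version of the same formulas.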
actually, i have some implementation of the model formula stuff in python, and some linear model stuff. i hope to contribute to scipy soon.... there was a brief discussion of this on scipy-dev over the past two weeks and it seems there is some interest in getting this stuff into scipy.

-- jonathan
_______________________________________________ SciPy-user mailing list SciPy-user@scipy.net http://www.scipy.net/mailman/listinfo/scipy-user
-- ------------------------------------------------------------------------ I'm part of the Team in Training: please support our efforts for the Leukemia and Lymphoma Society! http://www.active.com/donate/tntsvmb/tntsvmbJTaylor GO TEAM !!! ------------------------------------------------------------------------ Jonathan Taylor Tel: 650.723.9230 Dept. of Statistics Fax: 650.725.8977 Sequoia Hall, 137 www-stat.stanford.edu/~jtaylo 390 Serra Mall Stanford, CA 94305
Hi All

I can only offer my services as a tester for scipy-stats, but better statistics in Scipy would be great.

If we do go ahead with a new and improved stats package, I think a lot of up front design work would be great (I can help some with that, even if real statistical programming is beyond me). R/SPLUS seems to have grown partly by accretion and some of it is pretty ugly, especially wrt naming conventions. However, a lot of it is really great and would serve as a good model. I also think that a data.frame type of data type would be great. If we could concentrate on that and a really good (general) linear model framework we would be making great progress, I think.

It is funny that if you grep the scipy/stats directory for "residual" you get nothing :)...

Cheers
W
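To make the "residual" point concrete, here is a rough sketch of what a small linear model summary could return: coefficients, their standard errors, the residual standard error, and the residuals. It uses numpy.linalg.lstsq for the fit and the textbook covariance sigma^2 (X'X)^-1, and it assumes the design matrix X has full column rank and more rows than columns. The name ols_summary and the data are hypothetical, not an existing scipy.stats API.

```python
import numpy as np


def ols_summary(X, y):
    """Ordinary least squares with the usual summary quantities.

    Hypothetical helper; assumes X (n x p) has full column rank and n > p.
    Returns (beta, se, sigma, resid).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    beta, _, _, _ = np.linalg.lstsq(X, y, rcond=None)  # least-squares coefficients
    resid = y - X @ beta                               # residuals
    sigma = np.sqrt(resid @ resid / (n - p))           # residual standard error
    cov = sigma ** 2 * np.linalg.inv(X.T @ X)          # coefficient covariance matrix
    se = np.sqrt(np.diag(cov))                         # coefficient standard errors
    return beta, se, sigma, resid


# Made-up data: an intercept column plus one predictor.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 3.0, 5.0, 8.0])
X = np.column_stack([np.ones_like(x), x])
beta, se, sigma, resid = ols_summary(X, y)
print(beta, se, sigma)
```

With a design matrix that is a single column of ones, this reduces to the intercept-only case from the original question.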
Hi Webb,

As Jonathan mentioned, the devs are currently in the process of overhauling the stats package. You should probably take a look at what they're working on and make some comments/contributions if you wish.

Go to the SciPy-dev mail list:
http://www.scipy.net/mailman/listinfo/scipy-dev
Scipy-dev@scipy.net
glad to hear there is interest in this. sorry i haven't got around to making my linear model/formula stuff ready yet. hope to do it by early next week

-- jonathan
I don't have anything really useful to say, other than that I would also like a stronger focus on stats in scipy. I typically use R at the moment but would prefer to use scipy. I find the data.frame data type in R/Splus particularly helpful for the type of statistical analysis I undertake (basically something like a masked recarray with the ability to have column and row names). In my spare time I am working on trying to make something similar for numpy.

Mike
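As a very rough illustration of the "masked recarray with column and row names" idea, the sketch below combines a NumPy masked array with a structured dtype (named columns, per-cell missing values) and a plain dict for row names. The column names, row names, and values are all made up, and this is only one possible layout, not the design Mike describes working on.

```python
import numpy as np

# Hypothetical columns and values, for illustration only.
dtype = [("age", float), ("dose", float)]
data = np.ma.array(
    [(34.0, 1.5), (51.0, 0.0), (47.0, 2.0)],
    dtype=dtype,
    # One mask flag per cell; True marks a missing value (dose of row "s2").
    mask=[(False, False), (False, True), (False, False)],
)

# Row names kept in a side dict that maps a name to a row position.
row_names = ["s1", "s2", "s3"]
row_index = {name: i for i, name in enumerate(row_names)}

dose = data["dose"]                      # column access by name
mean_dose = dose.mean()                  # masked (missing) cells are ignored
age_s3 = data["age"][row_index["s3"]]    # cell access by column and row name
print(mean_dose, age_s3)
```

Keeping the row names outside the array is the simplest option; a real data.frame type would need to keep them in sync when rows are sliced or reordered.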
participants (4): dHering, Jonathan Taylor, Michael Sorich, Webb Sprague