on this topic: as an honest-to-goodness statistician, i would like to see more statistical modelling in scipy. i know Rpy exists, but the interface is not very pythonic.

i have some "home-brew" modules for linear regression, formula building (something like R's) and a few other things. if they went into something like scipy, they might gain from the criticisms of others...

is there any interest in making the equivalent of a scipy.stats.models module? i think an easily achievable medium-term goal is:

i) linear (least-squares) regression models, with or without weights or non-diagonal covariance matrices (in R: lm + more)

ii) generalized linear models (in R: glm)

iii) iteratively reweighted least squares algorithms (glm is a special case), e.g. robust regression (in R: rlm)

iv) ordinary least squares multivariate linear models (i.e. multivariate responses)

some of these models can easily be "broadcasted", others not so easily...

further goals are more general models: classification, constrained model fitting, model selection... for some of these things, it may not be worth duplicating R's (or other packages') efforts.

-- jonathan

Robert Kern wrote:
In the interest of improving the quality of the scipy.stats package, I hereby declare April and May of 2006 to be Statistics Review Months. I propose that we set ourselves a goal to review each function in stats.py and morestats.py (and a few others) for correctness and completeness of implementation by the end of May. By my count, that's about 2.5 functions every day. Surely this is a reasonable amount of effort for a rather large payoff: a robust, well-tested and thorough statistics library.
I have added a Wiki page describing the details:
http://projects.scipy.org/scipy/scipy/wiki/StatisticsReview
Barring any objections, I will be irretrievably creating the ~150 tickets or so for all of the functions to be reviewed later tonight. So if you object, act fast!
[Disclosure: this idea isn't mine. Eric Jones mentioned it to me once, and I'm just running with it.]
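For concreteness, here is a minimal sketch of item i) in the list above (least-squares regression with optional weights), assuming only NumPy. The function name `weighted_least_squares` and its signature are hypothetical, chosen for illustration; they are not an existing scipy API and not the home-brew modules mentioned above.

```python
import numpy as np

def weighted_least_squares(X, y, weights=None):
    """Fit y ~ X @ beta by (optionally weighted) least squares.

    Hypothetical helper for illustration only; not part of scipy.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    if weights is None:
        # No weights given: fall back to ordinary least squares.
        weights = np.ones(X.shape[0])
    sw = np.sqrt(np.asarray(weights, dtype=float))
    # Scaling each row by sqrt(w_i) turns the weighted problem
    # into an ordinary least-squares problem.
    beta, _, _, _ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta

# Toy data lying exactly on the line y = 1 + 2x.
x = np.arange(5.0)
X = np.column_stack([np.ones_like(x), x])  # intercept column + slope column
y = 1.0 + 2.0 * x
beta = weighted_least_squares(X, y, weights=[1, 2, 1, 2, 1])
```

Because the toy data are noise-free, any positive weights should recover the same intercept and slope; with noisy data the weights would change the fit. A non-diagonal covariance matrix could be handled along the same lines by whitening with a Cholesky factor instead of `sqrt(weights)`.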
--
------------------------------------------------------------------------
I'm part of the Team in Training: please support our efforts for the
Leukemia and Lymphoma Society!

http://www.active.com/donate/tntsvmb/tntsvmbJTaylor

GO TEAM !!!
------------------------------------------------------------------------
Jonathan Taylor                           Tel:   650.723.9230
Dept. of Statistics                       Fax:   650.725.8977
Sequoia Hall, 137                         www-stat.stanford.edu/~jtaylo
390 Serra Mall
Stanford, CA 94305