On Wed, Jan 4, 2012 at 10:46 PM, Fernando Perez <fperez.net@gmail.com> wrote:
On Wed, Jan 4, 2012 at 7:29 PM, Travis Oliphant <travis@continuum.io> wrote:
It seems like scipy stats has received quite a bit of attention. There is always more to do, of course, but I'm not sure what specifically you think is missing or needs work.
Well, I recently needed to do some simple linear modeling, and the stats glm docstring isn't very encouraging:
Docstring:
Calculates a linear model fit ... anova/ancova/lin-regress/t-test/etc. Taken from:

Peterson et al. Statistical limitations in functional neuroimaging I. Non-inferential methods and statistical models. Phil Trans Royal Soc Lond B 354: 1239-1260.

Returns
-------
statistic, p-value ???
### END of docstring
glm should have been removed a long time ago, since it doesn't make much sense. A basic OLS class might not be a bad addition to scipy, judging also from some of the questions I have seen on Stack Overflow from users of the cookbook class.
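To make the idea of a basic OLS class concrete, here is a rough sketch of what one could look like. It assumes NumPy is available; the class name `SimpleOLS` and its attribute names are hypothetical, chosen only for illustration, and are not an existing scipy, cookbook, or statsmodels API. It always adds an intercept column, so the centered R-squared is meaningful.

```python
import numpy as np


class SimpleOLS:
    """Hypothetical minimal ordinary least squares fit (illustration only)."""

    def __init__(self, y, x):
        y = np.asarray(y, dtype=float)
        x = np.asarray(x, dtype=float)
        if x.ndim == 1:
            # Treat a 1-D input as a single regressor column.
            x = x[:, np.newaxis]
        # Design matrix with a leading column of ones for the intercept.
        X = np.column_stack([np.ones(len(y)), x])

        # Least-squares solution of X @ params ~= y.
        params, _, rank, _ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ params

        self.params = params                  # [intercept, slope_1, ...]
        self.resid = resid                    # residuals y - fitted values
        self.df_resid = len(y) - int(rank)    # residual degrees of freedom

        ss_res = float(resid @ resid)
        ss_tot = float(((y - y.mean()) ** 2).sum())
        self.rsquared = 1.0 - ss_res / ss_tot

        # Residual variance estimate and coefficient standard errors.
        self.scale = ss_res / self.df_resid
        cov = self.scale * np.linalg.pinv(X.T @ X)
        self.bse = np.sqrt(np.diag(cov))


# Small made-up data set, purely for demonstration.
fit = SimpleOLS(y=[1.0, 3.0, 5.0, 8.0], x=[0.0, 1.0, 2.0, 3.0])
print("params:", fit.params)
print("std errors:", fit.bse)
print("R^2:", fit.rsquared)
```

A real implementation would also need p-values, handling of rank-deficient designs, and proper docs, which is where a dedicated package does the heavy lifting.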
I turned to statsmodels, which had great examples and it was very easy to use (for an ignoramus on the matter like myself).
But perhaps that happens to be an isolated point. I have to admit, I've just been using the pandas/statsmodels/sklearn combo directly. Part of that has to do also with the nice, long-form examples available for them, something which I think we still lack in numpy/scipy but where some of the new focused projects have done a great job (the matplotlib gallery blazed that trail, and others have followed with excellent results).
I'm not exactly unhappy about this :), especially once we get to the stage where you can type print modelresults.summary() and we either print diagnostic checks showing why you shouldn't trust your model results, or print no warnings because the diagnostic checks don't indicate anything is wrong. Of course I'm not so happy about the lack of examples in scipy.
Josef
Cheers,
f
_______________________________________________
SciPy-Dev mailing list
SciPy-Dev@scipy.org
http://mail.scipy.org/mailman/listinfo/scipy-dev