
Hi all,
In scipy.stats there are three functions that calculate various F-statistics for inputs obtained from univariate or multivariate ANOVA. These are f_value, f_value_multivariate and f_value_wilks_lambda: https://github.com/scipy/scipy/blob/master/scipy/stats/stats.py#L4603-L4683
The problem with those is that they're not very useful standalone. f_value implements a statistic that is also calculated and returned by f_oneway (which does one-way ANOVA). The other two functions are related to multivariate ANOVA, for which scipy.stats doesn't provide any functionality.
At the moment Statsmodels provides a lot more ANOVA functionality than scipy.stats does, and I agree with Josef [1, 2] that new functionality in this area would fit better in Statsmodels than in Scipy. There's also a recent proposal [3] for M-way repeated ANOVA to be added to scipy.stats. That could be added to Statsmodels instead (my preference). If we do want to add it to Scipy, we need a clear list of what else is needed to create a coherent set of functions in this area.
Thoughts?
Ralf
[1] https://github.com/scipy/scipy/issues/650
[2] https://github.com/scipy/scipy/issues/660
[3] https://github.com/scipy/scipy/issues/4913
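To illustrate the overlap mentioned above, here is a minimal sketch in plain Python. It assumes the statistic in question is the usual ratio built from the restricted-model and full-model sums of squared errors; the helper names f_from_errors and one_way_f and the sample data are hypothetical and are not scipy's API. If scipy is installed, the result is also compared with scipy.stats.f_oneway.

```python
def f_from_errors(er, ef, df_r, df_f):
    """F statistic from restricted (er) and full (ef) model sums of squared errors."""
    return ((er - ef) / (df_r - df_f)) / (ef / df_f)


def one_way_f(*groups):
    """One-way ANOVA F statistic computed via f_from_errors."""
    all_values = [x for g in groups for x in g]
    n, k = len(all_values), len(groups)
    grand_mean = sum(all_values) / n
    # Restricted model: a single common mean for every observation.
    er = sum((x - grand_mean) ** 2 for x in all_values)
    # Full model: each group has its own mean.
    ef = 0.0
    for g in groups:
        group_mean = sum(g) / len(g)
        ef += sum((x - group_mean) ** 2 for x in g)
    return f_from_errors(er, ef, n - 1, n - k)


# Made-up sample data, three groups of six observations each.
groups = (
    [6.0, 8.0, 4.0, 5.0, 3.0, 4.0],
    [8.0, 12.0, 9.0, 11.0, 6.0, 8.0],
    [13.0, 9.0, 11.0, 8.0, 7.0, 12.0],
)
f_manual = one_way_f(*groups)
print("manual F:", f_manual)

# Optional comparison; skipped when scipy is not available.
try:
    from scipy.stats import f_oneway
except ImportError:
    f_oneway = None

if f_oneway is not None:
    f_scipy = f_oneway(*groups)[0]
    print("matches f_oneway:", abs(f_manual - f_scipy) < 1e-9)
```

The point of the sketch is only that a caller who already runs a one-way ANOVA gets this number for free, which is why a standalone helper adds little.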

You can find the corresponding PR here: gh-4968 <https://github.com/scipy/scipy/pull/4968>
Cheers,
Abraham.
_______________________________________________ SciPy-Dev mailing list SciPy-Dev@scipy.org http://mail.scipy.org/mailman/listinfo/scipy-dev

I agree that it makes sense to move statistical testing code to statsmodels. From what I understand, the space of functions is probably too large for scipy to reasonably take on, and such functions seem likely to get more attention from the statsmodels folks.
Eric

To clarify the current situation a bit, or make it more explicit:
It's house cleaning time in scipy.stats, and the main question is whether to drop some functions that have accumulated in the past but have essentially lost their purpose within scipy.stats.
So there are essentially two options:
1) deprecate and delete those functions, or
2) expand on them so they become useful again.
The general opinion (or at least Ralf's and mine, and nobody else complained) is that new functionality that is not closely related to the good stuff in scipy.stats should go to statsmodels.
However, there are currently no plans to move the "good stuff" in scipy.stats to statsmodels. scipy.stats has a set of good library functions that remain in scipy and get improved and enhanced.
Also, scipy.stats has more code reviewers than statsmodels (and the main code reviewer of statsmodels gets too easily distracted with weird things. :).
Josef
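For readers unfamiliar with what option 1 involves in practice, here is a minimal sketch of a deprecation shim using only the Python standard library. It is an illustration under assumptions, not the mechanism scipy actually uses; the decorator deprecated and the function f_value_sketch are hypothetical names.

```python
import functools
import warnings


def deprecated(message):
    """Return a decorator that emits a DeprecationWarning on every call."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # stacklevel=2 points the warning at the caller, not at this wrapper.
            warnings.warn(message, DeprecationWarning, stacklevel=2)
            return func(*args, **kwargs)
        return wrapper
    return decorator


@deprecated("f_value_sketch is deprecated and will be removed in a later release")
def f_value_sketch(er, ef, df_r, df_f):
    """Hypothetical stand-in for a function scheduled for removal."""
    return ((er - ef) / (df_r - df_f)) / (ef / df_f)


# Capture the warning to show that the function still works during the
# deprecation period while callers are told to migrate.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = f_value_sketch(152.0, 68.0, 17, 15)

print(result)
print(caught[0].category.__name__)
```

The function keeps returning its result for one or more releases, so existing callers are warned before anything is deleted.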

Thanks for the summary, Josef. To expand on the above a little bit: I'm not opposed to adding new functionality to scipy.stats; however, what I'd like to avoid is adding a single function simply because it looks OK and is statistics-related. We should decide case by case whether or not an area of statistics makes sense to add and, if it does, then add a coherent set of functionality.
Ralf
participants (4)
- Abraham Escalante
- Eric Larson
- josef.pktd@gmail.com
- Ralf Gommers