[SciPy-User] scipy.stats one-sided two-sided less, greater, signed ?

josef.pktd at gmail.com josef.pktd at gmail.com
Sun Jun 5 15:43:18 EDT 2011


What should be the policy on one-sided versus two-sided?

The main reason right now for looking at this is
http://projects.scipy.org/scipy/ticket/1394 which specifies a
"one-sided" alternative and provides both lower and upper tail.

I would prefer that we follow a pattern for the alternative similar
to R's.

Currently only kstest has alternative : 'two_sided' (default),
'less' or 'greater', but this should be added to other tests where it
makes sense.

R's fisher.test:
"""alternative indicates the alternative hypothesis and must be one
of "two.sided", "greater" or "less". You can specify just the initial
letter. Only used in the 2 by 2 case."""

mannwhitneyu reports a one-sided test without actually specifying
which alternative is used. (I thought I remembered other cases like
this, but I can't find any right now.)

Related:
In many of the two-sided tests the test statistic has a sign that
indicates in which tail it falls.
This is useful in t-tests, for example, because the one-sided tests
can be backed out from the two-sided tests. (With symmetric
distributions the one-sided p-value is just half of the two-sided
p-value when the statistic falls in the tail of the alternative, and
one minus that half otherwise.)
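
A minimal sketch of backing out a one-sided p-value from a signed
statistic and its two-sided p-value. The helper name
one_sided_pvalue is hypothetical (it is not part of scipy.stats), and
the conversion is only valid when the null distribution is symmetric
about zero, as it is for t-tests.

```python
def one_sided_pvalue(statistic, p_two_sided, alternative):
    """Derive a one-sided p-value from a two-sided test result.

    Hypothetical helper, not a scipy function. Assumes the null
    distribution of `statistic` is symmetric about zero.
    """
    if alternative not in ("less", "greater"):
        raise ValueError("alternative must be 'less' or 'greater'")
    half = p_two_sided / 2.0
    # The sign of the statistic tells us which tail it fell in.
    if alternative == "greater":
        in_direction = statistic > 0
    else:
        in_direction = statistic < 0
    # Same tail as the alternative: half the two-sided p-value.
    # Opposite tail: the complement of that half.
    return half if in_direction else 1.0 - half
```

With scipy this could be used as
t, p = stats.ttest_1samp(x, 0.0); one_sided_pvalue(t, p, "greater").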

In the discussion of https://github.com/scipy/scipy/pull/8 I argued
that this might mislead users into interpreting a two-sided result as
a one-sided result. However, I now doubt that this is a strong
argument against reporting the signed test statistic.

After going through scipy.stats.stats, it looks like we always report
the signed test statistic.

The test statistic in ks_2samp is in all cases defined as a max value
and doesn't have a sign in R either, so adding a sign there would
break with the standard definition.
A one-sided option for ks_2samp would just require finding the
distribution of the test statistics D+ and D-.
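
For illustration, a sketch of the one-sided statistics themselves,
assuming the usual definitions D+ = max(F_x - F_y) and
D- = max(F_y - F_x) over the empirical CDFs, with D = max(D+, D-).
The function name is hypothetical and this is not how ks_2samp is
implemented. It does not compute p-values, which is the part that
would still need the distributions of D+ and D-.

```python
from bisect import bisect_right


def ks_2samp_statistics(x, y):
    """Return (D+, D-, D) for two samples.

    Hypothetical helper, not part of scipy.stats.
    """
    xs, ys = sorted(x), sorted(y)
    n, m = len(xs), len(ys)
    d_plus = d_minus = 0.0
    # The empirical CDFs only change at observed points, so it is
    # enough to compare them at every value in the pooled sample.
    for v in xs + ys:
        fx = bisect_right(xs, v) / n  # fraction of x that is <= v
        fy = bisect_right(ys, v) / m  # fraction of y that is <= v
        d_plus = max(d_plus, fx - fy)
        d_minus = max(d_minus, fy - fx)
    return d_plus, d_minus, max(d_plus, d_minus)
```

Neither D+ nor D- is negative, so the two-sided D stays unsigned, in
line with the standard definition.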

---

So my proposal for the general pattern (with exceptions for special
reasons) would be

* add/offer alternative : 'two_sided' (default), 'less' or 'greater'
in http://projects.scipy.org/scipy/ticket/1394 for now, and adjust
existing tests in the future (adding the option can mostly be done in
a backwards compatible way, and for symmetric distributions like
ttest it's just a convenience).
mannwhitneyu seems to be the only "weird" one.

* report the signed test statistic for the two-sided alternative
(when a signed test statistic exists). This is the status quo in
stats.stats, although I didn't know that it is actually this
consistent across tests.
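
A sketch of what the proposed pattern could look like for a test
whose signed statistic is standard normal under the null. The
function name pvalue_from_z is hypothetical, and the spelling
'two_sided' follows the kstest option quoted above. It uses only the
standard library, and 'less' is taken to mean the lower tail.

```python
from statistics import NormalDist


def pvalue_from_z(z, alternative="two_sided"):
    """p-value for a signed z statistic under a standard normal null.

    Hypothetical helper illustrating the proposed `alternative`
    option; not an existing scipy function.
    """
    null = NormalDist()  # mean 0, standard deviation 1
    if alternative == "less":
        return null.cdf(z)  # lower tail
    if alternative == "greater":
        return null.cdf(-z)  # upper tail, by symmetry
    if alternative == "two_sided":
        return 2.0 * null.cdf(-abs(z))  # both tails
    raise ValueError(
        "alternative must be 'two_sided', 'less' or 'greater'"
    )
```

Because the caller keeps the signed z, the default two-sided result
still shows which tail the statistic fell in.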

Opinions ?

Josef


