[SciPy-User] scipy.stats one-sided two-sided less, greater, signed ?
Bruce Southey
bsouthey at gmail.com
Mon Jun 13 12:18:17 EDT 2011
On 06/13/2011 02:46 AM, Ralf Gommers wrote:
>
>
> On Mon, Jun 13, 2011 at 3:50 AM, Bruce Southey <bsouthey at gmail.com> wrote:
>
> > On Sun, Jun 12, 2011 at 7:52 PM, <josef.pktd at gmail.com> wrote:
> > > On Sun, Jun 12, 2011 at 8:30 PM, Bruce Southey <bsouthey at gmail.com> wrote:
> >> On Sun, Jun 12, 2011 at 8:56 AM, <josef.pktd at gmail.com> wrote:
> >>>> On Sun, Jun 12, 2011 at 9:36 AM, Bruce Southey <bsouthey at gmail.com> wrote:
> >>>> On Sun, Jun 12, 2011 at 5:20 AM, Ralf Gommers <ralf.gommers at googlemail.com> wrote:
> >>>>>
> >>>>>
> >>>>> On Wed, Jun 8, 2011 at 12:56 PM, <josef.pktd at gmail.com> wrote:
> >>>>>>
> >>>>>> > On Tue, Jun 7, 2011 at 10:37 PM, Bruce Southey <bsouthey at gmail.com> wrote:
> >>>>>> > On Tue, Jun 7, 2011 at 4:40 PM, Ralf Gommers <ralf.gommers at googlemail.com> wrote:
> >>>>>> >>
> >>>>>> >>
> >>>>>> >> On Mon, Jun 6, 2011 at 9:34 PM, <josef.pktd at gmail.com> wrote:
> >>>>>> >>>
> >>>>>> >>> On Mon, Jun 6, 2011 at 2:34 PM, Bruce Southey <bsouthey at gmail.com> wrote:
> >>>>>> >>> > On 06/05/2011 02:43 PM, josef.pktd at gmail.com wrote:
> >>>>>> >>> >> What should be the policy on one-sided versus two-sided?
> >>>>>> >>> > Yes :-)
> >>>>>> >>> >
> >>>>>> >>> >> The main reason right now for looking at this is http://projects.scipy.org/scipy/ticket/1394 which specifies a "one-sided" alternative and provides both lower and upper tail.
> >>>>>> >>> > That refers to Fisher's test rather than the more 'traditional' one-sided tests. Each value of Fisher's test has a special meaning about the value or probability of the 'first cell' under the null hypothesis. So it is necessary to provide those three values.
> >>>>>> >>> >
> >>>>>> >>> >> I would prefer that we follow the alternative patterns similar to R.
> >>>>>> >>> >>
> >>>>>> >>> >> Currently only kstest has alternative : 'two_sided' (default), 'less' or 'greater', but this should be added to other tests where it makes sense.
> >>>>>> >>> > I think that these Kolmogorov-Smirnov tests do not have the traditional meaning either. It is a little mind-boggling to try to think about cdfs!
> >>>>>> >>> >
> >>>>>> >>> >> R fisher.exact:
> >>>>>> >>> >> """alternative indicates the alternative hypothesis and must be one of "two.sided", "greater" or "less". You can specify just the initial letter. Only used in the 2 by 2 case."""
> >>>>>> >>> >>
> >>>>>> >>> >> mannwhitneyu reports a one-sided test without actually specifying which alternative is used (I thought I remembered other cases like this but don't find any right now).
> >>>>>> >>> >>
> >>>>>> >>> >> Related: in many cases in the two-sided tests the test statistic has a sign that indicates in which tail the test statistic falls. This is useful in ttests, for example, because the one-sided tests can be backed out from the two-sided tests. (With symmetric distributions the one-sided p-value is just half of the two-sided p-value.)
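As a quick illustration of that halving trick, a minimal sketch (using scipy.stats.ttest_ind, which returns the signed statistic and a two-sided p-value; the data here are made up):

    import numpy as np
    from scipy import stats

    np.random.seed(0)
    x = np.random.normal(loc=0.2, scale=1.0, size=50)
    y = np.random.normal(loc=0.0, scale=1.0, size=50)

    t, p_two = stats.ttest_ind(x, y)   # signed statistic, two-sided p-value

    # For the alternative "mean(x) > mean(y)": halve the two-sided p-value
    # when t lies in the hypothesized tail, otherwise take the complement.
    # This relies on the symmetry of the null distribution mentioned above.
    p_greater = p_two / 2 if t > 0 else 1 - p_two / 2
    p_less = p_two / 2 if t < 0 else 1 - p_two / 2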
> >>>>>> >>> >>
> >>>>>> >>> >> In the discussion of https://github.com/scipy/scipy/pull/8 I argued that this might mislead users to interpret a two-sided result as a one-sided result. However, I doubt now that this is a strong argument against reporting the signed test statistic.
> >>>>>> >>> > (I do not follow pull requests, so is there a relevant ticket?)
> >>>>>> >>> >
> >>>>>> >>> >> After going through scipy.stats.stats, it looks like we always report the signed test statistic.
> >>>>>> >>> >>
> >>>>>> >>> >> The test statistic in ks_2samp is in all cases defined as a max value and doesn't have a sign in R either, so adding a sign there would break with the standard definition. A one-sided option for ks_2samp would just require finding the distribution of the test statistics D+, D-.
> >>>>>> >>> >>
> >>>>>> >>> >> ---
> >>>>>> >>> >>
> >>>>>> >>> >> So my proposal for the general pattern (with exceptions for special reasons) would be:
> >>>>>> >>> >>
> >>>>>> >>> >> * add/offer alternative : 'two_sided' (default), 'less' or 'greater' -- http://projects.scipy.org/scipy/ticket/1394 for now, and adjustments of existing tests in the future (adding the option can mostly be done in a backwards compatible way, and for symmetric distributions like the ttest it's just a convenience); mannwhitneyu seems to be the only "weird" one
> >>>>>> >>
> >>>>>> >> This would actually make the fisher_exact implementation more consistent, since only one p-value is returned in all cases. I just don't like the R naming much; alternative="greater" does not convey to me that this is a one-sided test using the upper tail. How about:
> >>>>>> >> test : {"two-tailed", "lower-tail", "upper-tail"}
> >>>>>> >> with two-tailed the default?
> >>>>>>
> >>>>>> I think matlab uses (in general) larger and smaller. The advantage of less/smaller and greater/larger is that it directly refers to the alternative hypothesis, while the meaning in terms of tails is not always clear (in kstest, and I guess some others, the test statistic is just reversed and uses the same tail in both cases).
> >>>>>>
> >>>>>> So greater/smaller is mostly "future proof" across tests, while reference to the tail can only be used where this is an unambiguous statement. But see below.
> >>>>>>
> >>>>> I think I understand your terminology a bit better now, and consistency across all tests is important. So I've updated the Fisher's exact patch to use alternative={'two-sided', 'less', 'greater'} and sent a pull request: https://github.com/scipy/scipy/pull/32
> >>>>>
> >>>>> Cheers,
> >>>>> Ralf
> >>>>>
> >>>>>>
> >>>>>>
> >>>>>> >>
> >>>>>> >> Ralf
> >>>>>> >>
> >>>>>> >>
> >>>>>> >>>
> >>>>>> >>> >>
> >>>>>> >>> >> * report the signed test statistic for the two-sided alternative (when a signed test statistic exists): this is the status quo in stats.stats, but I didn't know that it is actually pretty consistent across tests.
> >>>>>> >>> >>
> >>>>>> >>> >> Opinions ?
> >>>>>> >>> >>
> >>>>>> >>> >> Josef
> >>>>>> >>> > I think that there is some genuine misunderstanding here (as I was in the same situation) regarding what is meant. My understanding is that under a one-sided hypothesis, all the values of the null hypothesis exist in only one tail of the test distribution. In contrast, the values of the null distribution exist in both tails with a two-sided hypothesis. Yet that interpretation does not have the same meaning as the tails in the Fisher or Kolmogorov-Smirnov tests.
> >>>>>> >>>
> >>>>>> >>> The tests have a clear Null Hypothesis (equality) and Alternative Hypothesis (not equal, or directional: less or greater). So the "alternative" should be clearly specified in the function argument, as in R.
> >>>>>> >>>
> >>>>>> >>> Whether this corresponds to the left and right tails of the distribution is an "implementation detail" which holds for ttests but not for kstest/ks_2samp.
> >>>>>> >>>
> >>>>>> >>> kstest/ks_2samp: H0: cdf1 == cdf2, and H1: cdf1 != cdf2 or H1: cdf1 < cdf2 or H1: cdf1 > cdf2 (looks similar to comparing two survival curves in Kaplan-Meier?)
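As a concrete example of this H0/H1 pattern, a minimal sketch with the one-sample kstest, which already accepts the alternative keyword mentioned earlier in the thread (the spelling of the two-sided default, 'two_sided' versus 'two-sided', has varied between scipy releases, so check the docstring of your version; the sample data are made up):

    import numpy as np
    from scipy import stats

    np.random.seed(0)
    x = np.random.normal(loc=0.3, scale=1.0, size=200)   # sample shifted relative to N(0, 1)

    # H1: the cdfs differ (two-sided) versus the two directional alternatives.
    d_two, p_two = stats.kstest(x, 'norm')
    d_less, p_less = stats.kstest(x, 'norm', alternative='less')
    d_greater, p_greater = stats.kstest(x, 'norm', alternative='greater')

    print(p_two, p_less, p_greater)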
> >>>>>> >>>
> >>>>>> >>> fisher_exact (2 by 2): H0: odds-ratio == 1, and H1: odds-ratio != 1 or H1: odds-ratio < 1 or H1: odds-ratio > 1
> >>>>>> >>>
> >>>>>> >>> I know the Kolmogorov-Smirnov tests, but for Fisher's exact test and contingency tables I rely on R.
> >>>>>> >>>
> >>>>>> >>> from R-help:
> >>>>>> >>> For 2 by 2 tables, the null of conditional independence is equivalent to the hypothesis that the odds ratio equals one. <...> The alternative for a one-sided test is based on the odds ratio, so alternative = "greater" is a test of the odds ratio being bigger than or.
> >>>>>> >>> Two-sided tests are based on the probabilities of the tables, and take as ‘more extreme’ all tables with probabilities less than or equal to that of the observed table, the p-value being the sum of such probabilities.
> >>>>>> >>>
> >>>>>> >>> Josef
> >>>>>> >>>
> >>>>>> >>>
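That "sum of the probabilities of all tables at least as extreme" definition can be written down directly with scipy.stats.hypergeom; a minimal sketch for the 2 by 2 table that appears later in this thread (the result should be close to the two-sided p-value of 0.5741 reported by both R and SAS below):

    import numpy as np
    from scipy.stats import hypergeom

    table = np.array([[190, 800],
                      [200, 900]])
    n11 = table[0, 0]
    N = table.sum()          # grand total
    K = table[:, 0].sum()    # column-1 total
    n = table[0, :].sum()    # row-1 total

    # Probabilities of all tables with the same margins, indexed by the (1,1) cell.
    support = np.arange(max(0, n + K - N), min(n, K) + 1)
    probs = hypergeom.pmf(support, N, K, n)
    p_obs = hypergeom.pmf(n11, N, K, n)

    # Two-sided p-value: sum of probabilities <= that of the observed table
    # (a tiny relative tolerance guards against floating-point ties, as R does).
    p_two_sided = probs[probs <= p_obs * (1 + 1e-7)].sum()
    print(p_two_sided)   # should be close to 0.5741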
> >>>>>> >>> >
> >>>>>> >>> > I never paid much attention to the frequency-based tests, but it does not surprise me if there are no one-sided tests. Most are rank-based, so it is rather hard to do in a simple manner - actually I am not even sure how to use a permutation test.
> >>>>>> >>> >
> >>>>>> >>> > Bruce
> >>>>>> >>> >
> >>>>>> >>> >
> >>>>>> >>> >
> >>>>>> >>
> >>>>>> >
> >>>>>> > But that is NOT the correct interpretation here!
> >>>>>> > I tried to explain to you that this is not the usual idea of one-sided vs two-sided tests.
> >>>>>> > For example:
> >>>>>> > http://www.msu.edu/~fuw/teaching/Fu_ch10_2_categorical.ppt
> >>>>>> > "The test holds the marginal totals fixed and computes the hypergeometric probability that n11 is at least as large as the observed value"
> >>>>>>
> >>>>>> this still sounds like a less/greater test to me
> >>>>>>
> >>>>>>
> >>>>>> > "The output consists of three p-values:
> >>>>>> > Left: Use this when the alternative to independence is
> that there is
> >>>>>> > negative association between the variables. That is, the
> observations
> >>>>>> > tend to lie in lower left and upper right.
> >>>>>> > Right: Use this when the alternative to independence is
> that there is
> >>>>>> > positive association between the variables. That is, the
> observations
> >>>>>> > tend to lie in upper left and lower right.
> >>>>>> > 2-Tail: Use this when there is no prior alternative.
> >>>>>> > "
> >>>>>> > There is also the book "Categorical data analysis: using
> the SAS
> >>>>>> > system By Maura E. Stokes, Charles S. Davis, Gary G.
> Koch" that came
> >>>>>> > up via Google that also refers to the n11 cell.
> >>>>>> >
> >>>>>> > http://www.langsrud.com/fisher.htm
> >>>>>>
> >>>>>> I was trying to read the Agresti paper referenced there, but it has too much detail to get through in 15 minutes :)
> >>>>>>
> >>>>>> > "The output consists of three p-values:
> >>>>>> >
> >>>>>> > Left: Use this when the alternative to independence is
> that there
> >>>>>> > is negative association between the variables.
> >>>>>> > That is, the observations tend to lie in lower left
> and upper right.
> >>>>>> > Right: Use this when the alternative to independence
> is that there
> >>>>>> > is positive association between the variables.
> >>>>>> > That is, the observations tend to lie in upper left
> and lower right.
> >>>>>> > 2-Tail: Use this when there is no prior alternative.
> >>>>>> >
> >>>>>> > NOTE: Decide to use Left, Right or 2-Tail before
> collecting (or
> >>>>>> > looking at) the data."
> >>>>>> >
> >>>>>> > But you will get a different p-value if you switch rows
> and columns
> >>>>>> > because of the dependence on the n11 cell. If you do that
> then the
> >>>>>> > p-values switch between left and right sides as these now
> refer to
> >>>>>> > different hypotheses regarding that first cell.
> >>>>>>
> >>>>>> Switching rows and columns doesn't change the p-value in R; reversing columns changes the definition of less and greater, i.e. reverses them.
> >>>>>>
> >>>>>> The problem with 2 by 2 contingency tables with given marginals, i.e. row and column totals, is that we only have one free entry. Any test on one entry, e.g. element 0,0, pins down all the other ones, and (many) tests then become equivalent.
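A small way to see that "one free entry" point, using the margins of the 2 by 2 example that shows up later in the thread: once the row and column totals are fixed, choosing the (0, 0) cell determines the whole table.

    import numpy as np

    row_totals = np.array([990, 1100])   # margins of the example table below
    col_totals = np.array([390, 1700])

    def table_from_n11(n11):
        """Reconstruct the full 2x2 table once the (0, 0) cell is chosen."""
        n12 = row_totals[0] - n11
        n21 = col_totals[0] - n11
        n22 = row_totals[1] - n21
        return np.array([[n11, n12], [n21, n22]])

    print(table_from_n11(190))   # recovers [[190, 800], [200, 900]]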
> >>>>>>
> >>>>>>
> >>>>>> http://support.sas.com/documentation/cdl/en/procstat/63104/HTML/default/viewer.htm#procstat_freq_a0000000658.htm
> >>>>>> some math got lost
> >>>>>> """
> >>>>>> For 2 by 2 tables, one-sided p-values for Fisher's exact test are defined in terms of the frequency of the cell in the first row and first column of the table, the (1,1) cell. Denoting the observed (1,1) cell frequency by n11, the left-sided p-value for Fisher's exact test is the probability that the (1,1) cell frequency is less than or equal to n11. For the left-sided p-value, the set of "at least as extreme" tables includes those with a (1,1) cell frequency less than or equal to n11. A small left-sided p-value supports the alternative hypothesis that the probability of an observation being in the first cell is actually less than expected under the null hypothesis of independent row and column variables.
> >>>>>>
> >>>>>> Similarly, for a right-sided alternative hypothesis, the set consists of those tables where the frequency of the (1,1) cell is greater than or equal to that in the observed table. A small right-sided p-value supports the alternative that the probability of the first cell is actually greater than that expected under the null hypothesis.
> >>>>>>
> >>>>>> Because the (1,1) cell frequency completely determines the table when the marginal row and column sums are fixed, these one-sided alternatives can be stated equivalently in terms of other cell probabilities or ratios of cell probabilities. The left-sided alternative is equivalent to an odds ratio less than 1, where the odds ratio equals (n11*n22)/(n12*n21). Additionally, the left-sided alternative is equivalent to the column 1 risk for row 1 being less than the column 1 risk for row 2, n11/(n11+n12) < n21/(n21+n22). Similarly, the right-sided alternative is equivalent to the column 1 risk for row 1 being greater than the column 1 risk for row 2, n11/(n11+n12) > n21/(n21+n22). See Agresti (2007) for details.
> >>>>>> """
> >>>>>>
> >>>>>> I'm not a user of Fisher's exact test (and I have a hard time keeping the different statements straight), so if left/right or lower/upper makes more sense to users, then I don't complain.
> >>>>>>
> >>>>>> To me they are all just independence tests with possible one-sided alternatives that one distribution dominates the other (with the same pattern as ks_2samp or ttest_2samp).
> >>>>>>
> >>>>>> Josef
> >>>>>>
> >>>>>> >
> >>>>>> >
> >>>>>> > Bruce
> >>>>>
> >>>>>
> >>>> This is just wrong and plain ignorant! Please read the references and stats books about what the tails actually mean!
> >>>>
> >>>> You really need all three tests because they have different meanings and you do not know in advance which one you need.
> >>>
> >>> Sorry, but I'm perfectly happy to follow R and SAS in this.
> >>>
> >>> Josef
> >>>
> >>>>
> >>>> Bruce
> >>>
> >> So am I, which is NOT what is happening here!
> >
> > Why do you think that?
> Because all the stuff given above, including SAS which YOU provided, includes all three tests.
>
> > I quoted all the relevant descriptions from the R and SAS help, and I checked the following, and similar calls, for the cases that are in the changeset for the tests:
> >
> >> fisher.test(t(matrix(c(190,800,200,900),nrow=2)),alternative='g')
> >
> > Fisher's Exact Test for Count Data
> >
> > data: t(matrix(c(190, 800, 200, 900), nrow = 2))
> > p-value = 0.296
> > alternative hypothesis: true odds ratio is greater than 1
> > 95 percent confidence interval:
> > 0.8828407 Inf
> > sample estimates:
> > odds ratio
> > 1.068698
> >
> >> fisher.test(t(matrix(c(190,800,200,900),nrow=2)),alternative='l')
> >
> > Fisher's Exact Test for Count Data
> >
> > data: t(matrix(c(190, 800, 200, 900), nrow = 2))
> > p-value = 0.7416
> > alternative hypothesis: true odds ratio is less than 1
> > 95 percent confidence interval:
> > 0.000000 1.293552
> > sample estimates:
> > odds ratio
> > 1.068698
> >
> >> fisher.test(t(matrix(c(190,800,200,900),nrow=2)),alternative='t')
> >
> > Fisher's Exact Test for Count Data
> >
> > data: t(matrix(c(190, 800, 200, 900), nrow = 2))
> > p-value = 0.5741
> > alternative hypothesis: true odds ratio is not equal to 1
> > 95 percent confidence interval:
> > 0.8520463 1.3401490
> > sample estimates:
> > odds ratio
> > 1.068698
> >
> > All the p-values agree for the alternatives two-sided, less, and greater; the odds ratio is defined differently, as explained pretty well in the docstring.
> >
> > Josef
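For comparison, the corresponding calls in scipy, assuming a scipy.stats.fisher_exact that already has the alternative keyword from the patch discussed above; the p-values should reproduce the R output just quoted:

    from scipy import stats

    table = [[190, 800], [200, 900]]   # same table as in the R calls above

    _, p_greater = stats.fisher_exact(table, alternative='greater')
    _, p_less = stats.fisher_exact(table, alternative='less')
    odds_ratio, p_two = stats.fisher_exact(table, alternative='two-sided')

    # Expect roughly 0.296, 0.7416 and 0.5741, matching R's fisher.test output above
    # (the reported odds ratio is defined differently than R's conditional MLE).
    print(odds_ratio, p_greater, p_less, p_two)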
> >
> >
> >>
> >> Bruce
> >
>
> Yes, but you said to follow BOTH R and SAS - that means providing all three:
>
> The FREQ Procedure
>
> Table of Exposure by Response
>
> Exposure     Response
>
> Frequency |      0 |      1 |  Total
> ----------+--------+--------+
>         0 |    190 |    800 |    990
> ----------+--------+--------+
>         1 |    200 |    900 |   1100
> ----------+--------+--------+
> Total           390     1700    2090
>
>
> Statistics for Table of Exposure by Response
>
> Statistic DF Value Prob
> ------------------------------------------------------
> Chi-Square 1 0.3503 0.5540
> Likelihood Ratio Chi-Square 1 0.3500 0.5541
> Continuity Adj. Chi-Square 1 0.2869 0.5922
> Mantel-Haenszel Chi-Square 1 0.3501 0.5541
> Phi Coefficient 0.0129
> Contingency Coefficient 0.0129
> Cramer's V 0.0129
>
>
> Pearson Chi-Square Test
> ----------------------------------
> Chi-Square 0.3503
> DF 1
> Asymptotic Pr > ChiSq 0.5540
> Exact Pr >= ChiSq 0.5741
>
>
> Fisher's Exact Test
> ----------------------------------
> Cell (1,1) Frequency (F) 190
> Left-sided Pr <= F 0.7416
> Right-sided Pr >= F 0.2960
>
> Table Probability (P) 0.0376
> Two-sided Pr <= P 0.5741
>
> Sample Size = 2090
>
> Thus providing all three is the correct answer.
>
> Eh, we do. The interface is the same as that of R, and all three of {two-sided, less, greater} are extensively checked against R. It looks like you are reacting to only one statement Josef made to explain his interpretation of less/greater. Please check the actual commit and then comment if you see anything wrong.
>
> Ralf
>
>
I have looked at it (again) and my comments still stand:
A user should not have to read a statistics book and then the code to figure out what was actually implemented here. So I strongly object to Josef's statements, as you just cannot interpret Fisher's test in that way. Just look at how SAS presents the results; that should give a huge clue that the two-sided test is different from the one-sided tests.
Bruce