[SciPy-Dev] chi-square test for a contingency (R x C) table

Neil Martinsen-Burrell nmb at wartburg.edu
Wed Jun 2 12:26:04 EDT 2010


On 2010-06-02 11:02 , Bruce Southey wrote:
> On 06/02/2010 09:37 AM, josef.pktd at gmail.com wrote:
>> On Wed, Jun 2, 2010 at 8:24 AM, Neil Martinsen-Burrell <nmb at wartburg.edu> wrote:
>>
>>> On 2010-06-01 23:28 , Warren Weckesser wrote:
>>>
>>>> I've been digging into some basic statistics recently, and developed the
>>>> following function for applying the chi-square test to a contingency
>>>> table.  Does something like this already exist in scipy.stats? If not,
>>>> any objections to adding it?  (Tests are already written :)
>>>>
>>> Something like this would be great in scipy.stats since I end up doing
>>> the exact same thing by hand whenever I grade introductory statistics
>>> exams.  Thanks for writing this!
>>>
> You might find SAS helpful:
> http://support.sas.com/documentation/cdl/en/procstat/63104/HTML/default/viewer.htm#/documentation/cdl/en/procstat/63104/HTML/default/freq_toc.htm

I'm not sure what you mean by this.  I have no problem performing the
test; it's just inconvenient that it isn't already a part of scipy.stats.

> However, this code is the chi-squared test part as SAS will compute the
> actual cell numbers. Also an extension to scipy.stats.chisquare() so we
> can not have both functions.

Again, I don't understand what you mean when you say that we can't have
both functions.  I believe (from a statistics teacher's point of view)
that the chi-square goodness-of-fit test (which is stats.chisquare) is a
different beast from the chi-square test for independence (which would
be stats.chisquare_contingency).  The fact that the test statistic has
the same distribution in both cases should not tempt us to put them into
the same function.
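
To make the distinction concrete, here is a minimal sketch using only
the Python standard library.  All three function names are made up for
illustration; this is neither Warren's proposed function nor the
scipy.stats implementation.  The two tests build their expected counts
and degrees of freedom differently and share only the chi-square tail
probability.

```python
import math


def chi2_sf(x, df):
    """P(X > x) for a chi-square variable with integer df >= 1.

    Hypothetical helper.  Starts from the closed forms for df = 1 and
    df = 2 and steps up by two degrees of freedom at a time.
    """
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        q, k = math.exp(-half), 2               # df = 2: exp(-x/2)
    else:
        q, k = math.erfc(math.sqrt(half)), 1    # df = 1: erfc(sqrt(x/2))
    while k < df:
        # Q(x; k + 2) = Q(x; k) + (x/2)**(k/2) * exp(-x/2) / Gamma(k/2 + 1)
        q += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return q


def goodness_of_fit(observed, probs):
    """Goodness of fit: expected counts come from hypothesized proportions."""
    n = sum(observed)
    expected = [n * p for p in probs]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1
    return stat, chi2_sf(stat, df), df


def independence(table):
    """Independence: expected counts come from the table's own margins."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n
            stat += (o - e) ** 2 / e
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, chi2_sf(stat, df), df
```

For example, independence([[10, 20], [20, 10]]) has 1 degree of
freedom, while goodness_of_fit([10, 20, 30], [1/3, 1/3, 1/3]) has 2.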

> Really this should be combined with fisher.py in ticket 956:
> http://projects.scipy.org/scipy/ticket/956

Wow, apparently I have lots of disagreements today, but I don't think 
that this should be combined with Fisher's Exact test.  (I would like to 
see that ticket mature to the point where it can be added to 
scipy.stats.)  I like the functions in scipy.stats to correspond in a 
one-to-one manner with the statistical tests.  I think that the docs 
should "See Also" the appropriate exact (and non-parametric) tests, but 
I think that one function/one test is a good rule.  This is particularly 
true for people (like me) who would like to someday be able to use 
scipy.stats in a pedagogical context.
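
As a rough sketch of what that cross-referencing could look like, here
is a docstring skeleton in the numpydoc layout.  The function name
follows this thread, and fisher_exact is only a placeholder name for
whatever comes out of ticket 956; neither is a settled API, and the body
is deliberately left unimplemented.

```python
def chisquare_contingency(table):
    """Chi-square test of independence for an R x C contingency table.

    Placeholder sketch: the name follows this thread, not a released API.

    Parameters
    ----------
    table : array_like
        Observed counts, one row per level of the first variable.

    See Also
    --------
    chisquare : Chi-square goodness-of-fit test (a different test).
    fisher_exact : Hypothetical name for the exact test in ticket 956.
    """
    raise NotImplementedError("docstring sketch only")
```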

-Neil


