[SciPy-Dev] Exact p-values in Mann-Whitney U test

Szymon Łęski s.leski at nencki.gov.pl
Thu Mar 5 14:36:52 EST 2015


> IIUC "exact" in this context means running all possible re-orderings / exchanges of the data to estimate the null, e.g. for a paired t-test with 10 observations, doing 2 ** 10 permutations / sign flips.
> 
> Eric

That is correct. For Mann-Whitney there is a recursive formula for the null distribution of U, so you do not need to enumerate all possibilities explicitly.
This method is used in commercial software, e.g. GraphPad Prism (for samples < 100 elements):
http://www.graphpad.com/guides/prism/6/statistics/index.htm?how_the_mann-whitney_test_works.htm
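To make the recursion concrete, here is a minimal sketch, not a proposed implementation. It assumes there are no ties and that U counts the pairs (x_i, y_j) with x_i > y_j. The names count_u and exact_pvalue are hypothetical, and doubling the smaller tail is only one possible convention for the two-sided p-value. The idea: the largest of the m + n pooled values either belongs to x (adding n to U) or to y (adding nothing), so c(u; m, n) = c(u - n; m - 1, n) + c(u; m, n - 1).

```python
from functools import lru_cache
from math import comb  # requires Python 3.8+


@lru_cache(maxsize=None)
def count_u(u, m, n):
    """Number of orderings of m x-values and n y-values (no ties) with U == u."""
    if u < 0:
        return 0
    if m == 0 or n == 0:
        # With an empty sample there are no pairs, so only U == 0 is possible.
        return 1 if u == 0 else 0
    # Largest pooled value is an x (adds n to U) or a y (adds nothing).
    return count_u(u - n, m - 1, n) + count_u(u, m, n - 1)


def exact_pvalue(u_obs, m, n):
    """Two-sided exact p-value, taken here as twice the smaller tail (an assumption)."""
    total = comb(m + n, m)  # all orderings are equally likely under the null
    lower = sum(count_u(u, m, n) for u in range(0, u_obs + 1))
    upper = sum(count_u(u, m, n) for u in range(u_obs, m * n + 1))
    return min(1.0, 2.0 * min(lower, upper) / total)
```

For example, exact_pvalue(0, 3, 3) uses the single most extreme ordering out of comb(6, 3) = 20. Memoization keeps the work polynomial in m and n instead of visiting every ordering.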

Jamie, I tried MC to estimate p for the Mann-Whitney test, but it was slower than the exact method. It might have been poor code, though...
Thanks for the offer to take a look at a pull request; I will work on that.
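
For comparison, a Monte Carlo estimate of the kind discussed in this thread could look roughly like the sketch below. This is not the code I timed. The names u_statistic and mc_pvalue are hypothetical, it assumes no ties, and the two-sided criterion (distance of U from its null mean m*n/2) and the add-one correction are choices made for this illustration.

```python
import random


def u_statistic(x, y):
    """U as the number of pairs (xi, yj) with xi > yj (assumes no ties)."""
    return sum(1 for xi in x for yj in y if xi > yj)


def mc_pvalue(x, y, n_resamples=10000, seed=0):
    """Monte Carlo estimate of the two-sided p-value by random relabelling."""
    rng = random.Random(seed)
    m = len(x)
    pooled = list(x) + list(y)
    center = m * len(y) / 2.0  # mean of U under the null
    observed = abs(u_statistic(x, y) - center)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)  # random reassignment of group labels
        u = u_statistic(pooled[:m], pooled[m:])
        if abs(u - center) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_resamples + 1)
```

The standard error of such an estimate is roughly sqrt(p * (1 - p) / n_resamples), so each additional significant digit costs about 100 times more resamples.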

Szymon

> 
> 
> On Thu, Mar 5, 2015 at 8:06 AM, Sturla Molden <sturla.molden at gmail.com> wrote:
> Jamie Morton <jamietmorton at gmail.com> wrote:
> 
> > But perhaps having an exact p-value calculation for smaller sample sizes
> > would be preferable.
> 
> Monte Carlo randomization test is a good solution for rank-sum statistics.
> What does "exact" mean anyway?
> 
> MC tests are exact for a given number of significant digits. You just have
> to run it until the p-value has converged for the required number of
> significant digits.
> 
> Sturla


