[SciPy-User] stats.ranksums vs. stats.mannwhitneyu

Tue Oct 9 12:13:00 EDT 2012

I am trying to perform a Mann-Whitney U (AKA rank sum) test using
SciPy. My data consist of around 30 samples in total, with ties, so
the group sizes can be anything from 1:29 through 15:15 to 29:1.

As far as I can see there are two options:

scipy.stats.ranksums: does not handle ties; equivalent to R's
wilcox.test with exact=FALSE and correct=FALSE

scipy.stats.mannwhitneyu: handles ties; equivalent to R's wilcox.test
with exact=FALSE and correct set to match use_continuity
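
To make the difference between the two options concrete, here is a minimal pure-Python sketch of the normal approximation to the rank sum test, with the tie correction and the continuity correction as optional switches. It is not SciPy's or R's actual code. The names midranks and rank_sum_z, the two flags, and the sample data are all made up for illustration. As far as I can tell, both flags off corresponds to the ranksums behaviour described above, and both flags on corresponds to mannwhitneyu with use_continuity=True.

```python
import math


def midranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_z(x, y, tie_correction=False, continuity=False):
    """Two-sided normal-approximation test; returns (z, p).

    Hypothetical helper: assumes the pooled data are not all identical,
    otherwise the tie-corrected variance is zero.
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = list(x) + list(y)
    ranks = midranks(pooled)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2.0  # U statistic for x
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 * (n + 1) / 12.0  # variance assuming no ties
    if tie_correction:
        counts = {}
        for v in pooled:
            counts[v] = counts.get(v, 0) + 1
        tie_term = sum(t ** 3 - t for t in counts.values())
        # Ties shrink the variance of U.
        var_u -= n1 * n2 * tie_term / (12.0 * n * (n - 1))
    diff = u1 - mean_u
    if continuity:
        # Move the statistic 0.5 towards its mean, never past it.
        diff = math.copysign(max(abs(diff) - 0.5, 0.0), diff)
    z = diff / math.sqrt(var_u)
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided p-value
    return z, p


# Made-up data with several ties, only to show the effect.
x = [1, 2, 2, 3, 3, 3, 4]
y = [3, 3, 4, 4, 5, 5, 6, 6]
print("no corrections:    z=%.4f p=%.4f" % rank_sum_z(x, y))
print("tie correction:    z=%.4f p=%.4f" % rank_sum_z(x, y, tie_correction=True))
print("tie + continuity:  z=%.4f p=%.4f" % rank_sum_z(x, y, True, True))
```

In this sketch, ignoring ties leaves the variance too large, so the uncorrected z is smaller in magnitude and the p-value larger than with the tie correction. Without ties the two variants agree.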

So at first glance mannwhitneyu would seem to be the better choice,
except that the docs explicitly state that it should not be used with
fewer than 20 samples per group.

So what is the best function to use in this case? What kind of biases
will I get when I use mannwhitneyu with fewer than 20 samples per
group? And what sort of problems do ties cause with ranksums?

Cheers

Nils