[SciPy-Dev] Exact p-values in Mann-Whitney U test

Thu Mar 5 09:45:13 EST 2015

Hello,

I wrote a Python implementation of exact p-values for the Mann-Whitney U test. The current test (scipy.stats.mannwhitneyu) uses a normal approximation and is valid only for sample sizes > 20 (as stated in its notes). The exact version is also correct for small samples.
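
To illustrate what "exact" means here, below is a minimal sketch of one standard way to get an exact p-value: count, by recurrence, how many orderings of the pooled sample give each value of U. It is not the script from the Dropbox folder. The function names (count_u, u_statistic, exact_mannwhitney_p) are made up for this example, and it assumes no ties and a two-sided test.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def count_u(u, m, n):
    """Number of orderings of m x-values and n y-values with statistic U == u.

    Assumes no ties. The largest pooled value is either an x (it exceeds
    all n y-values, contributing n to U) or a y (contributing nothing).
    """
    if u < 0:
        return 0
    if m == 0 or n == 0:
        return 1 if u == 0 else 0
    return count_u(u - n, m - 1, n) + count_u(u, m, n - 1)


def u_statistic(x, y):
    """U = number of (x_i, y_j) pairs with x_i > y_j (no ties assumed)."""
    return sum(1 for xi in x for yj in y if xi > yj)


def exact_mannwhitney_p(x, y):
    """Return (U, two-sided exact p-value) under the no-ties assumption."""
    m, n = len(x), len(y)
    u = u_statistic(x, y)
    # The null distribution of U is symmetric about m*n/2, so the
    # two-sided p-value is twice the smaller tail, capped at 1.
    u_small = min(u, m * n - u)
    total = comb(m + n, m)  # all orderings are equally likely under H0
    tail = sum(count_u(k, m, n) for k in range(u_small + 1))
    return u, min(1.0, 2 * tail / total)


print(exact_mannwhitney_p([1, 2, 3], [4, 5, 6]))  # (0, 0.1)
```

With three observations per group there are only 20 equally likely orderings, so the smallest possible two-sided p-value is 2/20 = 0.1, which a normal approximation cannot reproduce reliably at this sample size.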

I believe this would be a useful addition to scipy.stats. However, the current version is still better for very large samples, so I think both versions should be kept. I would like to ask for opinions on the best way to include the new version.
Separate function? Optional argument controlling which method is used? Heuristics based on sample sizes?
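
As one possible shape for the last two options combined, here is a small sketch of an optional argument with a size-based default. Everything in it is hypothetical: the name choose_method, the method values ("auto", "exact", "asymptotic"), and the choice to compare the smaller sample against a threshold of 20 (taken from the note on the current function) are assumptions for discussion, not an existing SciPy API.

```python
def choose_method(m, n, method="auto", exact_threshold=20):
    """Pick which p-value computation to use for sample sizes m and n.

    Hypothetical helper: "exact" and "asymptotic" are honoured as given;
    "auto" uses the exact computation when the smaller sample has at most
    exact_threshold observations, and the normal approximation otherwise.
    """
    if method not in ("auto", "exact", "asymptotic"):
        raise ValueError("method must be 'auto', 'exact' or 'asymptotic'")
    if method != "auto":
        return method
    return "exact" if min(m, n) <= exact_threshold else "asymptotic"


print(choose_method(5, 8))                     # exact
print(choose_method(50, 60))                   # asymptotic
print(choose_method(50, 60, method="exact"))   # exact
```

A separate function would avoid changing the existing signature, while an argument like this keeps a single entry point and lets the user override the heuristic.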

I have put my script, and the paper I based the implementation on, in this Dropbox folder:
https://www.dropbox.com/sh/0zxp9u8sliwijl5/AAARecyrwQ2z-8xU-LbKOpWna?dl=0

Feedback appreciated!

Best regards,
Szymon Leski