[SciPy-Dev] proper way to test distributions

Vincent Davis vincent at vincentdavis.net
Mon Jun 14 23:07:18 EDT 2010


I was reviewing how the tests of distributions are done in scipy, with
the thought of applying the same methods to numpy.random. I have a lot
to learn here and would appreciate your suggestions.

Link to the scipy test
http://github.com/pv/scipy-work/blob/master/scipy/stats/tests/test_continuous_basic.py

If I understand correctly, the tests create a sample of 2000 from a
given distribution and then compare stats (mean, var...) calculated
with functions from numpy against those returned by the distribution
instance's .stats method. I am not sure how the mean is calculated
within the distribution (is it just using the scipy mean?). Anyway,
this seems a little circular.
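
A minimal sketch of the check as described above, not a copy of the
scipy test file. The helper name sample_vs_dist_stats, the seed, and
the choice of a normal distribution are assumptions made for
illustration only.

```python
import numpy as np
from scipy import stats


def sample_vs_dist_stats(dist, n=2000, seed=0):
    """Return (sample, reported) pairs for the mean and the variance.

    `dist` is a frozen scipy.stats distribution. The helper itself is
    hypothetical and exists only for this illustration.
    """
    # Draw n variates with a fixed seed so the run is repeatable.
    sample = dist.rvs(size=n, random_state=seed)
    # Ask the distribution object for its own mean and variance.
    m, v = dist.stats(moments="mv")
    return (sample.mean(), float(m)), (sample.var(ddof=1), float(v))


(mean_s, mean_t), (var_s, var_t) = sample_vs_dist_stats(
    stats.norm(loc=1.0, scale=2.0)
)
print("mean: sample=%.3f reported=%.3f" % (mean_s, mean_t))
print("var:  sample=%.3f reported=%.3f" % (var_s, var_t))
```

Whether the comparison is circular depends on where dist.stats gets
its values; the sketch only shows the comparison itself.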

Maybe I am missing something, but here are my thoughts.

1) Using seed() and then comparing the actual results (arrays) helps
to make sure the code is stable but tells you nothing about the
quality of the distribution.

2) Using seed() and then calculating the moments (with numpy and
dist.stats) is not really any different than (1).

3) Drawing a large sample (possibly using seed()) and calculating the
moments and comparing them to the theoretical moments seems like the
best option. But this could be slow.
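
To make options (1) and (3) concrete, here is a rough sketch using
numpy.random. The helper check_moments, the 5-sigma tolerance, the
sample size, and the choice of the exponential distribution are all
assumptions for illustration, not anything taken from the numpy or
scipy test suites.

```python
import numpy as np


def check_moments(sample, true_mean, true_var, n_sigma=5.0):
    """Compare sample moments with theoretical ones (hypothetical helper).

    Each tolerance is n_sigma times the standard error of the estimate.
    """
    n = sample.size
    # Standard error of the sample mean under the theoretical variance.
    se_mean = np.sqrt(true_var / n)
    mean_ok = abs(sample.mean() - true_mean) < n_sigma * se_mean
    # Squared deviations from the theoretical mean average to the variance;
    # their spread gives a standard error for that average.
    sq_dev = (sample - true_mean) ** 2
    se_var = np.sqrt(sq_dev.var(ddof=1) / n)
    var_ok = abs(sq_dev.mean() - true_var) < n_sigma * se_var
    return bool(mean_ok), bool(var_ok)


# Option (1): the same seed gives the same array, which shows stability only.
a = np.random.RandomState(1).normal(size=5)
b = np.random.RandomState(1).normal(size=5)
same_stream = np.array_equal(a, b)

# Option (3): large sample against known theory.
# For an exponential with scale s, mean = s and variance = s**2.
scale = 2.0
rng = np.random.RandomState(1234)
sample = rng.exponential(scale=scale, size=200000)
mean_ok, var_ok = check_moments(sample, true_mean=scale, true_var=scale ** 2)

# A deliberately wrong theoretical mean should be rejected.
wrong_mean_ok, _ = check_moments(sample, true_mean=1.1 * scale,
                                 true_var=scale ** 2)
print(same_stream, mean_ok, var_ok, wrong_mean_ok)
```

A tolerance tied to the standard error lets the sample size be chosen
for speed: a smaller sample gives a wider acceptance band rather than
a flaky test.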

What is the best way?
What is desired in numpy?

And a little off topic but isn't numpy.random duplicating scipy or
scipy duplicating numpy?

Thanks
Vincent


