[Numpy-discussion] NEP: Random Number Generator Policy

josef.pktd at gmail.com josef.pktd at gmail.com
Sun Jun 3 21:54:03 EDT 2018

On Sun, Jun 3, 2018 at 9:08 PM, Robert Kern <robert.kern at gmail.com> wrote:

> On Sun, Jun 3, 2018 at 5:46 PM <josef.pktd at gmail.com> wrote:
>> On Sun, Jun 3, 2018 at 8:21 PM, Robert Kern <robert.kern at gmail.com>
>> wrote:
>>>> The list of ``StableRandom`` methods should be chosen to support unit
>>>> tests:
>>>>     * ``.randint()``
>>>>     * ``.uniform()``
>>>>     * ``.normal()``
>>>>     * ``.standard_normal()``
>>>>     * ``.choice()``
>>>>     * ``.shuffle()``
>>>>     * ``.permutation()``
>>> https://github.com/numpy/numpy/pull/11229#discussion_r192604311
>>> @bashtage writes:
>>> > standard_gamma and standard_exponential are important enough to be
>>> > included here IMO.
>>> "Importance" was not my criterion, only whether they are used in unit
>>> test suites. This list was just off the top of my head for methods that I
>>> think were actually used in test suites, so I'd be happy to be shown live
>>> tests that use other methods. I'd like to be a *little* conservative about
>>> what methods we stick in here, but we don't have to be *too* conservative,
>>> since we are explicitly never going to be modifying these.
>> That's one area where I thought the selection is too narrow.
>> We should be able to get a stable stream from the uniform for some
>> distributions.
>> However, according to the Wikipedia description Poisson doesn't look
>> easy. I just wrote a unit test for statsmodels using Poisson random numbers
>> with hard coded numbers for the regression tests.
> I'd really rather people do this than use StableRandom; this is best
> practice, as I see it, if your tests involve making precise comparisons to
> expected results.

I hardcoded the results, not the random data. So the unit tests rely on a
reproducible stream of Poisson random numbers.
I don't want to save 500 (100 or 1000) observations in a csv file for every
variation of the unit test that I run.
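
A minimal sketch of this pattern, assuming the legacy `np.random.RandomState`
stream; the helper name `poisson_sample`, the seed, and the parameters are
hypothetical and only illustrate regenerating the data from a seed instead of
storing it:

```python
import numpy as np


def poisson_sample(seed=987654, lam=3.0, nobs=500):
    # Hypothetical helper: regenerate the test data from a seed instead of
    # loading it from a csv file.
    rng = np.random.RandomState(seed)
    return rng.poisson(lam, size=nobs)


y1 = poisson_sample()
y2 = poisson_sample()

# Same seed, same stream: the observations never need to be stored.
assert np.array_equal(y1, y2)

# In a real regression test this value would be a literal copied from a
# verified run; it is recomputed here only to keep the sketch self-contained.
expected_mean = y1.mean()
np.testing.assert_allclose(y2.mean(), expected_mean, rtol=1e-13)
```

The hard-coded result is only meaningful as long as the seeded stream itself
does not change between releases.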

> StableRandom is intended as a crutch so that the pain of moving existing
> unit tests away from the deprecated RandomState is less onerous. I'd really
> rather people write better unit tests!
> In particular, I do not want to add any of the integer-domain
> distributions (aside from shuffle/permutation/choice) as these are the ones
> that have the platform-dependency issues with respect to 32/64-bit `long`
> integers. They'd be unreliable for unit tests even if we kept them stable
> over time.
>> I'm not sure which other distributions are common enough and not easily
>> reproducible by transformation. E.g. negative binomial can be reproduced by
>> a gamma-Poisson mixture.
>> On the other hand normal can be easily recreated from standard_normal.
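
The two transformations mentioned here can be sketched as follows. This is an
illustration under assumed parameters (`n`, `p`, `loc`, `scale`, and the seed
are arbitrary), and it reproduces the distributions, not necessarily the exact
stream that `negative_binomial()` or `normal()` would return:

```python
import numpy as np

rng = np.random.RandomState(12345)
nobs = 200000

# Negative binomial as a gamma-Poisson mixture:
# lam ~ Gamma(shape=n, scale=(1 - p) / p), then y | lam ~ Poisson(lam).
n, p = 5.0, 0.5
lam = rng.gamma(n, (1 - p) / p, size=nobs)
y_nb = rng.poisson(lam)

# Theoretical moments of the negative binomial, for comparison.
nb_mean = n * (1 - p) / p
nb_var = n * (1 - p) / p**2

# Normal draws recreated from standard normal draws by shifting and scaling.
loc, scale = 2.0, 3.0
x = loc + scale * rng.standard_normal(nobs)
```
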
> I was mostly motivated by making it a bit easier to mechanically replace
> uses of randn(), which is probably even more common than normal() and
> standard_normal() in unit tests.
>> Would it be difficult to keep this list large, given that it should be
>> frozen, low-maintenance code?
> I admit that I had in mind non-statistical unit tests. That is, tests that
> didn't depend on the precise distribution of the inputs.

The problem is that the unit tests in `stats` rely on precise inputs (up to
some numerical noise).
For example, p-values themselves are uniformly distributed if the hypothesis
test works correctly. That means that if I don't have control over the
inputs, then my p-value could be anything in (0, 1). So we need either a real
dataset, all the random numbers saved in a file, or a reproducible set of
random numbers.
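
The uniformity of p-values can be illustrated with a small simulation. This is
a sketch that assumes a two-sided z-test with known standard deviation and data
generated under the null hypothesis; the seed and sizes are arbitrary:

```python
import math

import numpy as np

rng = np.random.RandomState(2018)
nrep, nobs = 5000, 100

# Data generated under the null hypothesis: mean 0, known standard deviation 1.
x = rng.standard_normal((nrep, nobs))
z = x.mean(axis=1) * math.sqrt(nobs)

# Two-sided p-value of the z-test: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
pvalues = np.array([math.erfc(abs(zi) / math.sqrt(2)) for zi in z])

# Share of replications rejected at the 5% level; this should be close to
# 0.05 if the p-values are uniform on (0, 1).
reject_rate = (pvalues < 0.05).mean()
```

Each replication is a valid sample, yet the individual p-values range over
nearly all of (0, 1), so a hard-coded p-value only makes sense for one fixed
sample.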

95% of the unit tests that I write are for statistics. A large fraction of
them don't rely on the exact distribution, but do rely on random numbers
that are "good enough".
For example, when writing unit tests, every once in a while (sometimes more
often) I get a "bad" stream of random numbers for which convergence might
fail or where the estimated numbers are far away from the true numbers, so
the test tolerance would have to be very high.
If I pick one of the seeds that looks good, then I can have a tighter unit
test tolerance to ensure results are good in a nice case.
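
A rough sketch of the effect of the seed on the tolerance, assuming the
estimate is simply the sample mean of 500 Poisson observations (the true rate,
sample size, and seed range are arbitrary):

```python
import numpy as np

lam_true, nobs = 3.0, 500

# Estimate the Poisson rate from one sample per seed.
estimates = np.array([
    np.random.RandomState(seed).poisson(lam_true, size=nobs).mean()
    for seed in range(200)
])
errors = np.abs(estimates - lam_true)

# A test that must pass for any seed needs a tolerance of at least `worst`;
# a test pinned to `best_seed` can use a much tighter one.
worst = errors.max()
best_seed = int(errors.argmin())
```
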

The problem is that we cannot write robust unit tests for regression tests
without stable inputs.
E.g. I verified my results with a Monte Carlo with 5000 replications and
1000 Poisson observations in each.
Results look close to expected and won't depend much on the exact stream of
random variables.
But the Monte Carlo for each variant of the test took about 40 seconds.
Doing this for all option combinations and dataset specifications takes too
long to be feasible in a unit test suite.
So I rely on numpy's stable random numbers and hard code the results for a
specific random sample in the regression unit tests.


> --
> Robert Kern