[Numpy-discussion] NEP: Random Number Generator Policy

Ralf Gommers ralf.gommers at gmail.com
Sun Jun 3 23:20:23 EDT 2018


On Sun, Jun 3, 2018 at 6:54 PM, <josef.pktd at gmail.com> wrote:

>
>
> On Sun, Jun 3, 2018 at 9:08 PM, Robert Kern <robert.kern at gmail.com> wrote:
>
>> On Sun, Jun 3, 2018 at 5:46 PM <josef.pktd at gmail.com> wrote:
>>
>>>
>>>
>>> On Sun, Jun 3, 2018 at 8:21 PM, Robert Kern <robert.kern at gmail.com>
>>> wrote:
>>>
>>>>
>>>> The list of ``StableRandom`` methods should be chosen to support unit
>>>>> tests:
>>>>>
>>>>>     * ``.randint()``
>>>>>     * ``.uniform()``
>>>>>     * ``.normal()``
>>>>>     * ``.standard_normal()``
>>>>>     * ``.choice()``
>>>>>     * ``.shuffle()``
>>>>>     * ``.permutation()``
>>>>>
>>>>
>>>> https://github.com/numpy/numpy/pull/11229#discussion_r192604311
>>>> @bashtage writes:
>>>> > standard_gamma and standard_exponential are important enough to be
>>>> included here IMO.
>>>>
>>>> "Importance" was not my criterion, only whether they are used in unit
>>>> test suites. This list was just off the top of my head for methods that I
>>>> think were actually used in test suites, so I'd be happy to be shown live
>>>> tests that use other methods. I'd like to be a *little* conservative about
>>>> what methods we stick in here, but we don't have to be *too* conservative,
>>>> since we are explicitly never going to be modifying these.
>>>>
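For concreteness, these are the calls such a frozen stream has to keep
bit-for-bit reproducible. A minimal sketch against today's RandomState,
standing in for the proposed StableRandom (which is not a shipped API);
the seed and shapes are illustrative only:

    import numpy as np

    rs = np.random.RandomState(12345)  # stand-in for the proposed StableRandom

    rs.randint(0, 10, size=5)         # integers on [0, 10)
    rs.uniform(0.0, 1.0, size=5)      # floats on [0, 1)
    rs.normal(loc=0.0, scale=1.0, size=5)
    rs.standard_normal(5)             # normal(0, 1)
    rs.choice(np.arange(10), size=5)  # sampling with replacement
    x = np.arange(10)
    rs.shuffle(x)                     # in-place permutation
    rs.permutation(10)                # permuted copy of arange(10)

With the seed fixed, each call returns identical values on every run,
which is what lets a test suite hardcode expected outputs.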
>>>
>>> That's one area where I thought the selection is too narrow.
>>> We should be able to get a stable stream from the uniform for some
>>> distributions.
>>>
>>> However, according to the Wikipedia description, Poisson doesn't look
>>> easy to derive from uniforms. I just wrote a unit test for statsmodels
>>> using Poisson random numbers, with hard-coded numbers for the regression
>>> tests.
>>>
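The contrast is easy to see in code. An inverse-CDF transform consumes
exactly one uniform per variate, so freezing uniform() freezes the output;
a Poisson generator like Knuth's consumes a variable number of uniforms
per draw. A sketch, with illustrative parameters:

    import numpy as np

    rs = np.random.RandomState(0)
    lam = 3.0

    # Exponential by inversion: one uniform in, one variate out, so the
    # output stream is exactly as stable as the uniform stream feeding it.
    u = rs.uniform(size=1000)
    expo = -np.log1p(-u) / lam

    # Poisson by Knuth's multiplication method: the loop consumes a
    # variable number of uniforms per variate, so the output depends on
    # the generator's internals, not just on the uniform stream.
    def poisson_knuth(lam, rs):
        L = np.exp(-lam)
        k, p = 0, 1.0
        while True:
            p *= rs.uniform()
            if p <= L:
                return k
            k += 1

    sample = [poisson_knuth(lam, rs) for _ in range(5)]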
>>
>> I'd really rather people do this than use StableRandom; this is best
>> practice, as I see it, if your tests involve making precise comparisons to
>> expected results.
>>
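A minimal sketch of that best practice, with illustrative numbers: both
the input data and the expected result are hardcoded, so the test never
touches a random stream at all:

    import numpy as np

    def test_sample_mean_fixed_data():
        # Data pasted in once from a single random draw; no seed or RNG
        # is needed at test time.
        y = np.array([2, 4, 3, 1, 5, 3, 2, 4, 3, 3])
        np.testing.assert_allclose(y.mean(), 3.0)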
>
> I hardcoded the results, not the random data. So the unit tests rely on a
> reproducible stream of Poisson random numbers.
> I don't want to save 500 (or 100, or 1000) observations in a csv file for
> every variation of the unit test that I run.
>

I agree: hardcoding numbers in every place where seeded random numbers are
now used is quite unrealistic.

It may be worth having a look at the test suites for scipy, statsmodels,
scikit-learn, etc. and estimating how much work this NEP causes those
projects. If the devs of those packages are forced to do large-scale
migrations from RandomState to StableRandom, then why not instead keep
RandomState and just add a new API next to it?

Ralf



>
>
>>
>> StableRandom is intended as a crutch so that the pain of moving existing
>> unit tests away from the deprecated RandomState is less onerous. I'd really
>> rather people write better unit tests!
>>
>> In particular, I do not want to add any of the integer-domain
>> distributions (aside from shuffle/permutation/choice) as these are the ones
>> that have the platform-dependency issues with respect to 32/64-bit `long`
>> integers. They'd be unreliable for unit tests even if we kept them stable
>> over time.
>>
>>
>>> I'm not sure which other distributions are common enough and not easily
>>> reproducible by transformation. E.g. negative binomial can be reproduced
>>> by a gamma-Poisson mixture.
>>>
>>> On the other hand normal can be easily recreated from standard_normal.
>>>
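Both reductions are simple at the distribution level. A sketch; note the
equality is in distribution only, not a bit-for-bit match with NumPy's own
normal() or negative_binomial() streams:

    import numpy as np

    rs = np.random.RandomState(42)

    # normal(loc, scale) from standard_normal by shift-and-scale:
    x = 5.0 + 2.0 * rs.standard_normal(1000)   # ~ normal(5.0, 2.0)

    # negative binomial as a gamma-Poisson mixture:
    # lam ~ Gamma(shape=n, scale=(1 - p) / p)  =>  Poisson(lam) ~ NB(n, p)
    n, p = 10, 0.3
    lam = rs.standard_gamma(n, size=1000) * (1 - p) / p
    counts = rs.poisson(lam)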
>>
>> I was mostly motivated by making it a bit easier to mechanically replace
>> uses of randn(), which is probably even more common than normal() and
>> standard_normal() in unit tests.
>>
>>
>>> Would it be difficult to keep this list large, given that it should be
>>> frozen, low-maintenance code?
>>>
>>
>> I admit that I had in mind non-statistical unit tests. That is, tests
>> that didn't depend on the precise distribution of the inputs.
>>
>
> The problem is that the unit tests in `stats` rely on precise inputs (up
> to some numerical noise).
> For example, p-values themselves are uniformly distributed if the
> hypothesis test works correctly. That means if I don't have control over
> the inputs, then my p-value could be anything in (0, 1). So we either need
> a real dataset, save all the random numbers to a file, or have a
> reproducible set of random numbers.
>
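A sketch of the resulting test pattern; the seed and the hardcoded
p-value below are placeholders for values pasted in from one real run:

    import numpy as np
    from scipy import stats

    def test_ttest_pvalue_regression():
        # Under the null the p-value is uniform on (0, 1), so without a
        # fixed input stream there is nothing meaningful to assert.
        rs = np.random.RandomState(98765)
        x = rs.normal(0.0, 1.0, size=100)
        _, pvalue = stats.ttest_1samp(x, 0.0)
        # Hardcoded from a previous run of this exact stream (placeholder).
        np.testing.assert_allclose(pvalue, 0.512345, rtol=1e-7)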
> 95% of the unit tests that I write are for statistics. A large fraction of
> them don't rely on the exact distribution, but do rely on random numbers
> that are "good enough".
> For example, when writing unit tests, every once in a while (or sometimes
> more often) I get a "bad" stream of random numbers, for which convergence
> might fail or the estimated numbers are far away from the true values, so
> the test tolerance would have to be very high.
> If I pick one of the seeds that looks good, then I can use a tighter unit
> test tolerance to ensure results are good in a nice case.
>
> The problem is that we cannot write robust regression unit tests without
> stable inputs.
> E.g. I verified my results with a Monte Carlo with 5000 replications and
> 1000 Poisson observations in each.
> Results look close to expected and won't depend much on the exact stream
> of random variables.
> But the Monte Carlo for each variant of the test took about 40 seconds.
> Doing this for all option combinations and dataset specifications takes
> too long to be feasible in a unit test suite.
> So I rely on numpy's stable random numbers and hard code the results for a
> specific random sample in the regression unit tests.
>
> Josef
>
>
>
>>
>> --
>> Robert Kern
>>