[Numpy-discussion] NEP: Random Number Generator Policy
robert.kern at gmail.com
Sun Jun 3 21:08:38 EDT 2018
On Sun, Jun 3, 2018 at 5:46 PM <josef.pktd at gmail.com> wrote:
> On Sun, Jun 3, 2018 at 8:21 PM, Robert Kern <robert.kern at gmail.com> wrote:
>> The list of ``StableRandom`` methods should be chosen to support unit tests:
>>> * ``.randint()``
>>> * ``.uniform()``
>>> * ``.normal()``
>>> * ``.standard_normal()``
>>> * ``.choice()``
>>> * ``.shuffle()``
>>> * ``.permutation()``
>> @bashtage writes:
>> > standard_gamma and standard_exponential are important enough to be
>> > included here IMO.
>> "Importance" was not my criterion, only whether they are used in unit
>> test suites. This list was just off the top of my head for methods that I
>> think were actually used in test suites, so I'd be happy to be shown live
>> tests that use other methods. I'd like to be a *little* conservative about
>> what methods we stick in here, but we don't have to be *too* conservative,
>> since we are explicitly never going to be modifying these.
> That's one area where I thought the selection is too narrow.
> We should be able to get a stable stream from the uniform for some
> distributions.
> However, according to the Wikipedia description Poisson doesn't look easy.
> I just wrote a unit test for statsmodels using Poisson random numbers with
> hard coded numbers for the regression tests.
I'd really rather people do this than use StableRandom; this is best
practice, as I see it, if your tests involve making precise comparisons to
expected results.
StableRandom is intended as a crutch so that the pain of moving existing
unit tests away from the deprecated RandomState is less onerous. I'd really
rather people write better unit tests!
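As a minimal sketch of the hard-coded approach described above: the input data is typed into the test itself, so no random stream is involved at all. The sample values and the `poisson_rate_mle` helper are hypothetical, made up for illustration; they are not taken from statsmodels.

```python
import math

# Hypothetical hard-coded sample, typed in by hand for illustration.
# It does not come from any NumPy stream or from the statsmodels test suite.
COUNTS = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]


def poisson_rate_mle(counts):
    """Maximum-likelihood estimate of a Poisson rate, i.e. the sample mean."""
    return sum(counts) / len(counts)


def test_poisson_rate_mle():
    # Expected value worked out by hand: the counts sum to 39 over 10 samples.
    assert math.isclose(poisson_rate_mle(COUNTS), 3.9)


test_poisson_rate_mle()
```

Because the inputs are literals, the test keeps passing no matter how any generator changes between releases or platforms.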
In particular, I do not want to add any of the integer-domain distributions
(aside from shuffle/permutation/choice) as these are the ones that have the
platform-dependency issues with respect to 32/64-bit `long` integers.
They'd be unreliable for unit tests even if we kept them stable over time.
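To see which side of the 32/64-bit `long` split a given machine is on, something like the following should work. It only inspects the C `long` width through the standard library; `c_long_bits` is a name invented for this sketch.

```python
import ctypes


def c_long_bits():
    """Return the width of the platform's C `long`, in bits."""
    return 8 * ctypes.sizeof(ctypes.c_long)


# Typically 32 on 64-bit Windows and 64 on 64-bit Linux and macOS.
print(c_long_bits())
```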
> I'm not sure which other distributions are common enough and not easily
> reproducible by transformation. E.g. negative binomial can be reproduced by
> a gamma-poisson mixture.
> On the other hand normal can be easily recreated from standard_normal.
I was mostly motivated by making it a bit easier to mechanically replace
uses of randn(), which is probably even more common than normal() and
standard_normal() in unit tests.
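The transformations mentioned in this exchange can be sketched as follows. This uses the existing `numpy.random.RandomState` as a stand-in, since ``StableRandom`` is only a proposal; the helper names, seed, and parameter values are assumptions made for illustration. Note that the gamma-Poisson version is only expected to match `negative_binomial()` in distribution, not number for number.

```python
import numpy as np

SEED = 12345  # arbitrary seed chosen for this illustration


def normal_from_standard_normal(seed, loc, scale, size):
    """Rebuild normal(loc, scale) draws by shifting and scaling standard_normal()."""
    rs = np.random.RandomState(seed)
    return loc + scale * rs.standard_normal(size)


def negative_binomial_from_mixture(seed, n, p, size):
    """Draw negative binomial variates as a gamma-Poisson mixture.

    The Poisson rate is Gamma(shape=n, scale=(1 - p) / p) distributed.
    """
    rs = np.random.RandomState(seed)
    lam = rs.standard_gamma(n, size=size) * (1.0 - p) / p
    return rs.poisson(lam)


# normal() versus the shifted and scaled standard_normal(), same seed.
direct = np.random.RandomState(SEED).normal(2.0, 3.0, size=5)
rebuilt = normal_from_standard_normal(SEED, 2.0, 3.0, 5)
print(np.allclose(direct, rebuilt))

# randn(k) versus standard_normal(k), same seed.
from_randn = np.random.RandomState(SEED).randn(5)
from_standard_normal = np.random.RandomState(SEED).standard_normal(5)
print(np.array_equal(from_randn, from_standard_normal))
```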
> Would it be difficult to keep this list large, given that it should be
> frozen, low-maintenance code?
I admit that I had in mind non-statistical unit tests. That is, tests that
didn't depend on the precise distribution of the inputs.
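A small sketch of what such a non-statistical test might look like: the assertion holds for any ordering of the input, so the particular stream only matters for reproducing a failure. It uses the standard-library `random` module rather than NumPy to stay self-contained, and `check_sort_roundtrip` is a hypothetical name.

```python
import random


def check_sort_roundtrip(seed, n=100):
    """Shuffle 0..n-1 with a seeded generator and check that sorting restores it."""
    rng = random.Random(seed)
    values = list(range(n))
    rng.shuffle(values)
    return sorted(values) == list(range(n))


# The property does not depend on which permutation the generator produced.
assert all(check_sort_roundtrip(seed) for seed in range(5))
```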