[Numpy-discussion] NEP: Random Number Generator Policy

Sun Jun 3 21:52:19 EDT 2018

On Sun, Jun 3, 2018 at 6:26 PM <josef.pktd at gmail.com> wrote:

>
>
> On Sun, Jun 3, 2018 at 9:04 PM, Robert Kern <robert.kern at gmail.com> wrote:
>
>> On Sun, Jun 3, 2018 at 6:01 PM <josef.pktd at gmail.com> wrote:
>>
>>>
>>>
>>> On Sun, Jun 3, 2018 at 8:36 PM, Robert Kern <robert.kern at gmail.com>
>>> wrote:
>>>
>>>> On Sun, Jun 3, 2018 at 4:35 PM Eric Wieser <wieser.eric+numpy at gmail.com>
>>>> wrote:
>>>>
>>>>> You make a bunch of good points refuting reproducible research as an
>>>>> argument for not changing the random number streams.
>>>>>
>>>>> However, there’s a second use-case you don’t address - unit tests. For
>>>>> better or worse, downstream, or even our own
>>>>> <https://github.com/numpy/numpy/blob/c4813a9/numpy/core/tests/test_multiarray.py#L5093-L5108>,
>>>>> unit tests use a seeded random number generator as a shorthand to produce
>>>>> some arbitrary array, and then hard-code the expected output in their tests.
>>>>> Breaking stream compatibility will break these tests.
>>>>>
>>>> By the way, the reason that I didn't mention this use case as a
>>>> motivation in the Status Quo section is that, as I reviewed my mail
>>>> archive, this wasn't actually a motivating use case for the policy. It's
>>>> certainly a use case that developed once we did make these
>>>> (*cough*extravagant*cough*) guarantees, though, as people started to rely
>>>> on it, and I hope that my StableRandom proposal addresses it to your
>>>> satisfaction. I could add some more details about that history if you
>>>> think it would be useful.
>>>>
>>>
>>> I don't think that's accurate.
>>> The unit tests for stable random numbers were added when Enthought
>>> silently changed the normal random numbers and we got messages from users
>>> that the unit tests failed and they could not reproduce our results.
>>>
>>> 6/12/10
>>> [SciPy-Dev] seeded randn gets different values on osx
>>>
>>> (I can't find an online copy; this is from my own mail archive)
>>>
>>
>> The policy was in place Nov 2008.
>>
>
> Only for the underlying stream; those unit tests didn't guarantee it
> for the actual distributions.
>
> https://github.com/numpy/numpy/commit/898e6bdc625cdd3c97865ef99f8d51c5f43eafff
>
> So maybe there was a discussion in 2008, which was mostly before my time.
> The guarantee for distributions was added in 2010/2011, at least in terms
> of unit tests in numpy, in order to protect the unit tests in scipy.stats
> and, by analogy, similar cases in other packages and across users.
>

The policy existed for the distributions regardless of whether or not we
had a test suite that ensured it. I cannot share internal emails, of
course, but please be assured that the existence of the policy was one of
my arguments for rolling back that addition to EPD (and would have been
what I argued to prevent it from going out, had I been aware of it).

-- 
Robert Kern