[Numpy-discussion] NEP: Random Number Generator Policy

Charles R Harris charlesr.harris at gmail.com
Mon Jun 4 01:26:08 EDT 2018

On Sun, Jun 3, 2018 at 11:03 PM, Robert Kern <robert.kern at gmail.com> wrote:

> On Sun, Jun 3, 2018 at 9:24 PM Charles R Harris <charlesr.harris at gmail.com>
> wrote:
>> On Sat, Jun 2, 2018 at 1:04 PM, Robert Kern <robert.kern at gmail.com>
>> wrote:
>>> This policy was first instated in Nov 2008 (in essence; the full set of
>>> weasel
>> Instituted?
> I meant "instated"; cf. for another usage:
> https://www.youredm.com/2018/06/01/spotify-new-policy-update/
> But "instituted" would work just as well. It may be that "instated a
> policy" is just an idiosyncratic back-formation of "reinstated a policy",
> which even to me feels more right.
>>> Not Versioning
>>> --------------
>>>
>>> For a long time, we considered that the way to allow algorithmic
>>> improvements while maintaining the stream was to apply some form of
>>> versioning.  That is, every time we make a stream change in one of the
>>> distributions, we increment some version number somewhere.
>>> ``numpy.random`` would keep all past versions of the code, and there
>>> would be a way to get the old versions.  Proposals of how to do this
>>> exactly varied widely, but we will not exhaustively list them here.  We
>>> spent years going back and forth on these designs and were not able to
>>> find one that sufficed.  Let that time lost, and more importantly, the
>>> contributors that we lost while we dithered, serve as evidence against
>>> the notion.
>>>
>>> Concretely, adding in versioning makes maintenance of ``numpy.random``
>>> difficult.  Necessarily, we would be keeping lots of versions of the
>>> same code around.  Adding a new algorithm safely would still be quite
>>> hard.
>>>
>>> But most importantly, versioning is fundamentally difficult to *use*
>>> correctly.  We want to make it easy and straightforward to get the
>>> latest, fastest, best versions of the distribution algorithms;
>>> otherwise, what's the point?  The way to make that easy is to make the
>>> latest the default.  But the default will necessarily change from
>>> release to release, so the user’s code would need to be altered anyway
>>> to specify the specific version that one wants to replicate.
>>>
>>> Adding in versioning to maintain stream-compatibility would still only
>>> provide the same level of stream-compatibility that we currently do,
>>> with all of the limitations described earlier.  Given that the standard
>>> practice for such needs is to pin the release of ``numpy`` as a whole,
>>> versioning ``RandomState`` alone is superfluous.
>> This section is a bit unclear. Would it be correct to say that the rng
>> version is the numpy version? If so, it might be best to say that up front
>> before justifying it.
> I'm sorry, I'm unclear on what you are asking me to make clearer. There is
> currently no such thing as "the rng version". The thrust of this section of
> the NEP is to reject the previously floated idea of introducing the concept
> at all. So I would certainly not say anything along the lines that "the rng
> version is the numpy version". I do say, here and earlier, that the way to
> get the same RNG code is to get the same version of numpy.

Just so, and you could make that clearer, as you do here.

>> Mostly off topic, but I note that the new module proposes integers of
>> various lengths using the Python half open ranges. I would like to suggest
>> that we modify that just a hair so we can specify the whole range in the
>> integer interval specification. For instance, the full range of an 8 bit
>> unsigned integer could be given as `(0, 0)`, i.e., (0, 255 + 1). This would
>> be most useful for the biggest (64 bit) types, but I am more thinking of
>> the case where sequences of ranges can be used.
> That is indeed something out of scope for this NEP discussion. Feel free
> to open an issue on the randomgen Github. But suffice it to say that I
> intend to make sure that the new subsystem has at least feature parity with
> the current code, and that is one of the features in the current code.
> --
> Robert Kern
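
For concreteness, here is a rough sketch of the interval reading I have in
mind, written in plain Python with the standard `random` module rather than
against randomgen. The helper name `wrapped_integers` and its `bits`
parameter are made up for illustration and are not part of numpy or
randomgen. The assumption is that both bounds are reduced modulo 2**bits, so
that `(0, 0)` and `(0, 255 + 1)` describe the same full 8-bit range.

```python
import random


def wrapped_integers(low, high, bits, rng=random):
    """Draw one unsigned integer from the half-open interval [low, high).

    Hypothetical helper: both bounds are reduced modulo 2**bits, and an
    interval whose reduced bounds coincide, such as (0, 0), is read as
    the full range of the type instead of an empty range.
    """
    modulus = 1 << bits
    low %= modulus
    # Width of the interval after reduction; 0 means "everything".
    span = (high - low) % modulus
    if span == 0:
        span = modulus
    # randrange(span) is uniform on [0, span); shift and wrap into range.
    return (low + rng.randrange(span)) % modulus


# Full range of an 8-bit unsigned integer, spelled two equivalent ways.
a = wrapped_integers(0, 0, bits=8)
b = wrapped_integers(0, 255 + 1, bits=8)

# Ordinary half-open ranges keep their usual meaning.
c = wrapped_integers(3, 7, bits=8)  # one of 3, 4, 5, 6
```

A side effect of the modular reading in this sketch is that a pair like
`(250, 4)` wraps around the top of the type, and an empty range can no
longer be written. Whether either is desirable is a separate question.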