[Numpy-discussion] Moving NumPy's PRNG Forward

Kevin Sheppard kevin.k.sheppard at gmail.com
Fri Jan 26 11:14:07 EST 2018


I am a firm believer that the current situation is not sustainable.  There
are a lot of improvements that could practically be incorporated.  While many
of these are performance related, there are also improvements in accuracy
over some ranges of parameters that cannot be incorporated while the stream
is frozen. I also think that perfect stream reproducibility across versions
is a bit of a myth, since it would really require an identical OS, compiler
and possibly CPU for some of the generators that produce floats.

I believe there is a case for separating the random generator from core
NumPy.  Some points that favor becoming a subproject:

1. It is a pure consumer of the NumPy API.  Other parts of the API do not
depend on random.
2. A standalone package could be installed alongside many different
versions of core NumPy, which would reduce the pressure to freeze the
stream.

In terms of what is needed, I think that the underlying PRNG should be
swappable.  This will provide a simple mechanism to allow certain types of
advancement while easily providing backward compatibility.  In the current
design this is very hard and requires compiling many nearly identical copies
of RandomState. In pseudocode, something like

standard_normal(prng)

where prng is a basic class that retains the PRNG state and has a small set
of core random number generators that belong to the underlying PRNG --
probably something like int32, int64, double, and possibly int53. I am not
advocating explicitly passing the PRNG as an argument, but having
generators which can take any suitable PRNG would add a lot of flexibility
in terms of taking advantage of improvements in the underlying PRNGs (see,
e.g., xoroshiro128/xorshift1024).  The "small" core PRNG would have
responsibility over state and streams.  The remainder of the module would
transform the underlying PRNG into the required distributions.
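
A minimal Python sketch of this split, for illustration only. The names
CorePRNG, next_uint64, next_uint32 and next_double are hypothetical, not an
existing NumPy API, and SplitMix64 and xorshift64* stand in for the larger
generators mentioned above simply because they fit in a few lines. The
distribution function sees only the core interface, so either generator can
be swapped in without touching it.

```python
import math

MASK64 = (1 << 64) - 1


class CorePRNG:
    """Hypothetical minimal core: owns the state, emits raw integers and doubles."""

    def next_uint64(self):
        raise NotImplementedError

    def next_uint32(self):
        # Use the high 32 bits of a 64-bit draw.
        return self.next_uint64() >> 32

    def next_double(self):
        # 53 random bits scaled into [0, 1).
        return (self.next_uint64() >> 11) * (1.0 / (1 << 53))


class SplitMix64(CorePRNG):
    def __init__(self, seed):
        self.state = seed & MASK64

    def next_uint64(self):
        self.state = (self.state + 0x9E3779B97F4A7C15) & MASK64
        z = self.state
        z = ((z ^ (z >> 30)) * 0xBF58476D1CE4E5B9) & MASK64
        z = ((z ^ (z >> 27)) * 0x94D049BB133111EB) & MASK64
        return z ^ (z >> 31)


class Xorshift64Star(CorePRNG):
    def __init__(self, seed):
        seed &= MASK64
        if seed == 0:
            raise ValueError("xorshift state must be nonzero")
        self.state = seed

    def next_uint64(self):
        x = self.state
        x ^= x >> 12
        x ^= (x << 25) & MASK64
        x ^= x >> 27
        self.state = x
        return (x * 0x2545F4914F6CDD1D) & MASK64


def standard_exponential(prng):
    """Distribution code: depends only on the CorePRNG interface."""
    # 1 - u lies in (0, 1], so the logarithm is always finite.
    return -math.log(1.0 - prng.next_double())


# The same distribution function runs unchanged on either core generator.
for core in (SplitMix64(2018), Xorshift64Star(2018)):
    draws = [standard_exponential(core) for _ in range(3)]
    print(type(core).__name__, draws)
```

Here the state and stream logic lives entirely in the two small classes,
while standard_exponential never inspects it.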

This would also simplify making improvements, since old versions could be
saved or improved versions could be added to the API.  For example,

from numpy.random import standard_normal, prng # Preferred versions
standard_normal(prng)  # Ziggurat
from numpy.random.legacy import standard_normal_bm, mt19937 # legacy generators
standard_normal_bm(mt19937) # Box-Muller
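
To make the old/new pairing concrete, here is a hedged sketch using only the
Python standard library. random.Random (which is MT19937 in CPython) plays
the role of the core PRNG, and the Marsaglia polar method stands in for
Ziggurat, which needs precomputed tables and would be too long to show here.
The function names mirror the pseudocode above and are hypothetical.

```python
import math
import random


def standard_normal_bm(prng):
    """Box-Muller: two uniforms in, one normal out (the sine twin is discarded)."""
    u1 = 1.0 - prng.random()  # in (0, 1], keeps log() finite
    u2 = prng.random()
    return math.sqrt(-2.0 * math.log(u1)) * math.cos(2.0 * math.pi * u2)


def standard_normal_polar(prng):
    """Marsaglia polar method: rejection sampling, no trigonometric calls."""
    while True:
        x = 2.0 * prng.random() - 1.0
        y = 2.0 * prng.random() - 1.0
        s = x * x + y * y
        if 0.0 < s < 1.0:
            return x * math.sqrt(-2.0 * math.log(s) / s)


# Same seed, same core generator, different transformation: the two
# functions consume the uniform stream differently, so their outputs differ.
legacy = random.Random(12345)
current = random.Random(12345)
print(standard_normal_bm(legacy))
print(standard_normal_polar(current))
```

Both functions can live side by side, so code that needs the old stream
keeps calling the old transformation while new code picks up the
replacement.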

Kevin