[Numpy-discussion] SFMT (faster mersenne twister)
Nathaniel Smith
njs at pobox.com
Tue Sep 9 14:08:52 EDT 2014
On 8 Sep 2014 14:43, "Robert Kern" wrote:
>
On Mon, Sep 8, 2014 at 6:05 PM, Pierre-Andre Noel <noel.pierre.andre at gmail.com> wrote:
I think we could add new generators to NumPy though,
perhaps with a keyword to control the algorithm (defaulting to the
current Mersenne Twister).
> >
Why not do something like the C++11 <random>? In <random>, a "generator"
is the engine producing randomness, and a "distribution" decides what
type of output you want. Here is the example from
http://www.cplusplus.com/reference/random/ .
> >
std::default_random_engine generator;
std::uniform_int_distribution<int> distribution(1,6);
int dice_roll = distribution(generator); // generates number in the range 1..6
> >
For convenience, you can bind the generator with the distribution (still
from the web page above).
> >
auto dice = std::bind(distribution, generator);
int wisdom = dice()+dice()+dice();
> >
Here is how I propose to adapt this scheme to numpy. First, there would
be a global generator defaulting to the current implementation of
Mersenne Twister. Calls to numpy's "RandomState", "seed", "get_state" and
"set_state" would affect this global generator.
> >
All numpy functions associated with random number generation (i.e.,
everything listed on
http://docs.scipy.org/doc/numpy/reference/routines.random.html except
for "RandomState", "seed", "get_state" and "set_state") would accept the
kwarg "generator", which defaults to the global generator (by default
the current Mersenne Twister).
> >
Now there could be other generator objects: the new Mersenne Twister,
some lightweight-but-worse generator, or some cryptographically-safe
random generator. Each such generator would have "RandomState", "seed",
"get_state" and "set_state" methods (except perhaps the
cryptographically-safe ones). When calling a numpy function with
generator=my_generator, that function uses this generator instead of the
global one. Moreover, there would be a function, say
select_default_random_engine(generator), which changes the global
generator to a user-specified one.
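A minimal Python sketch of the scheme proposed above may make it concrete. Every name here is hypothetical and none of it is an existing numpy API: the standard library's random.Random stands in for the current Mersenne Twister, and "uniform" stands in for any numpy.random function that would gain the "generator" kwarg.

```python
import random

# Hypothetical names throughout; this is not an existing numpy API.
# random.Random (a Mersenne Twister) stands in for the current global generator.
_default_generator = random.Random()


def select_default_random_engine(generator):
    """Replace the module-wide default generator with a user-specified one."""
    global _default_generator
    _default_generator = generator


def seed(value):
    """Seed whichever generator is currently the global default."""
    _default_generator.seed(value)


def uniform(low=0.0, high=1.0, size=1, generator=None):
    """Draw `size` floats in [low, high) from `generator`, or from the global default."""
    # Resolve the default at call time so that a later call to
    # select_default_random_engine() is honoured.
    gen = _default_generator if generator is None else generator
    return [low + (high - low) * gen.random() for _ in range(size)]
```

The default is looked up inside the function rather than written as generator=_default_generator in the signature, because a default argument is bound once at definition time and would not follow a later change of the global generator. Under the same assumptions, uniform(size=3, generator=random.SystemRandom()) would draw from an OS-backed source without touching the global state.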
>
I think the Python standard library's example is more instructive. We
have new classes for each new core uniform generator. They will share
a common superclass to share the implementation of the non-uniform
distributions. numpy.random.RandomState will continue to be the
current Mersenne Twister implementation, and so will the underlying
global RandomState for all of the convenience functions in
numpy.random. If you want the SFMT variant, you instantiate
numpy.random.SFMT() and call its methods directly.
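As a rough illustration of that class layout, here is a sketch in plain Python. The class names (BaseRandomState, MT19937State, LCGState) are assumptions made for this example, and numpy.random.SFMT itself is only a proposal in this thread. The point is that each core generator overrides one uniform primitive while the non-uniform distributions live once in the superclass.

```python
import math
import random


class BaseRandomState:
    """Hypothetical shared superclass: distributions built on random_sample()."""

    def random_sample(self):
        """Return one float in [0, 1); each core generator overrides this."""
        raise NotImplementedError

    def uniform(self, low=0.0, high=1.0):
        return low + (high - low) * self.random_sample()

    def exponential(self, scale=1.0):
        # Inverse-CDF method; 1 - u lies in (0, 1], so the log is always defined.
        return -scale * math.log(1.0 - self.random_sample())


class MT19937State(BaseRandomState):
    """Core generator backed by the standard library's Mersenne Twister."""

    def __init__(self, seed=None):
        self._rng = random.Random(seed)

    def random_sample(self):
        return self._rng.random()


class LCGState(BaseRandomState):
    """Toy 32-bit linear congruential generator, shown only as a second subclass."""

    def __init__(self, seed=0):
        self._state = seed % 2**32

    def random_sample(self):
        self._state = (1664525 * self._state + 1013904223) % 2**32
        return self._state / 2**32
```

With this layout, MT19937State(42).exponential() and LCGState(42).exponential() run the same distribution code over different uniform streams, and existing users of the default class see no change when a new subclass is added.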
There's also another reason why generator decisions should be part of the
RandomState object itself: we may want to change the distribution methods
themselves over time (e.g., people have been complaining for a while that
we use a suboptimal method for generating gaussian deviates), but changing
these creates similar backcompat problems. So we need a way to say "please
give me samples using the old gaussian implementation" or whatever.
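One way to picture this is a state object that records which algorithm backs each distribution, so that a seeded stream stays reproducible even after a newer method is added. The sketch below is only an assumption about how that could look: the class name, the gauss_method parameter, and the choice of the two algorithms (Marsaglia's polar method and the basic Box-Muller transform) are illustrative stand-ins, not numpy's actual old or new implementation.

```python
import math
import random


class VersionedRandomState:
    """Hypothetical sketch: the object remembers which gaussian algorithm it uses."""

    def __init__(self, seed=None, gauss_method="polar"):
        if gauss_method not in ("polar", "box-muller"):
            raise ValueError("unknown gauss_method: %r" % (gauss_method,))
        self._rng = random.Random(seed)
        self.gauss_method = gauss_method

    def standard_normal(self):
        """Return one standard normal deviate using the selected algorithm."""
        if self.gauss_method == "polar":
            return self._polar()
        return self._box_muller()

    def _polar(self):
        # Marsaglia polar method: rejection-sample a point inside the unit disc.
        while True:
            x = 2.0 * self._rng.random() - 1.0
            y = 2.0 * self._rng.random() - 1.0
            r2 = x * x + y * y
            if 0.0 < r2 < 1.0:
                return x * math.sqrt(-2.0 * math.log(r2) / r2)

    def _box_muller(self):
        # Basic Box-Muller transform; 1 - u avoids log(0).
        u1 = 1.0 - self._rng.random()
        u2 = self._rng.random()
        return math.sqrt(-2.0 * math.log(u1)) * math.cos(2.0 * math.pi * u2)
```

Two objects built with the same seed and the same gauss_method produce identical samples, while switching the method changes the stream. That difference is the backcompat problem described above, and it is why the choice has to be stored on the object.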
-n
