[Numpy-discussion] Adopt Mersenne Twister 64bit?

Robert Kern robert.kern at gmail.com
Mon Mar 11 05:46:54 EDT 2013

On Sun, Mar 10, 2013 at 6:12 PM, Siu Kwan Lam <siu at continuum.io> wrote:
> Hi all,
> I am redirecting a discussion on github issue tracker here.  My original
> post (https://github.com/numpy/numpy/issues/3137):
> "The current implementation of the RNG seems to be MT19937-32. Since 64-bit
> machines are common nowadays, I am suggesting adding or upgrading to
> MT19937-64.  Thoughts?"
> Let me start by responding to njsmith's comments on the issue tracker:
> Would it be faster?
> Although I have not benchmarked the 64-bit implementation, it is likely to
> be faster on a 64-bit machine, since the number of iterations
> (controlled by NN and MM in the reference implementation
> http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/VERSIONS/C-LANG/mt19937-64.c)
> is reduced by half.  In addition, each generation in the 64-bit
> implementation produces a 64-bit random int, which can be used to generate
> a double-precision random number, whereas the 32-bit implementation
> requires generating a pair of 32-bit random ints.

From the last time this was brought up, it looks like getting a single
64-bit integer out from MT19937-64 takes about the same amount of time
as getting a single 32-bit integer from MT19937-32, perhaps a little
slower, even on a 64-bit machine.


So getting a single double, which takes two 32-bit draws but only one
64-bit draw, would be not quite twice as fast.
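
For reference, here is a rough Python sketch of the two conversions being
compared. The arithmetic follows the genrand_res53 routines in the MT
reference C implementations; the function names, and the use of Python's
`random.getrandbits` as a stand-in source of raw integers, are assumptions
made for illustration and are not NumPy code.

```python
import random

TWO_POW_26 = 67108864.0          # 2**26
TWO_POW_53 = 9007199254740992.0  # 2**53


def double_from_two_uint32(a, b):
    """32-bit path: two raw draws are needed for one 53-bit double."""
    a >>= 5  # keep the top 27 bits of the first draw
    b >>= 6  # keep the top 26 bits of the second draw
    return (a * TWO_POW_26 + b) / TWO_POW_53


def double_from_one_uint64(x):
    """64-bit path: one raw draw already carries 53 usable bits."""
    return (x >> 11) / TWO_POW_53


# Stand-in raw integer source (an assumption; not the generator under discussion).
raw = random.Random(0)
u = double_from_two_uint32(raw.getrandbits(32), raw.getrandbits(32))
v = double_from_one_uint64(raw.getrandbits(64))
```

Both functions return values in [0, 1) with 53 bits of resolution; the only
difference is how many raw draws each one consumes.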

> But, on a 32-bit machine, a 64-bit instruction is translated into 4 32-bit
> instructions; thus, it is likely to be slower.  (1)
> Use less memory?
> The amount of memory use will remain the same.  The size of the RNG state is
> the same.
> Provide higher quality randomness?
> My naive answer is that the 32-bit and 64-bit implementations have the same
> 2^19937-1 period. I need to do some research and experiments.
> Would it change the output of this program?
>         import numpy
>         numpy.random.seed(0)
>         print numpy.random.random()
> Unfortunately, yes.  The 64-bit implementation generates a different random
> number sequence with the same seed. (2)
> My suggestion to overcome (1) and (2) is to allow the user to select between
> the two implementations (and possibly different algorithms in the future).
> If the user does not provide a choice, we use MT19937-32 by default.
>         numpy.random.set_state("MT19937_64", …)   # choose the 64-bit implementation

Most likely, the different PRNGs should be different subclasses of
RandomState. The module-level convenience API should probably be left
alone. If you need to control the PRNG that you are using, you really
need to be passing around a RandomState instance and not relying on
reseeding the shared global instance. Aside: I really wish we hadn't
exposed `set_state()` in the module API. It's an attractive nuisance.
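
A minimal sketch of what passing an instance around looks like, using the
existing `numpy.random.RandomState` class. The `simulate` function and the
seed values are made up for the example.

```python
import numpy as np


def simulate(n, prng):
    # Draw only from the generator that was handed in, never from the
    # module-level functions that share one global instance.
    return prng.random_sample(n)


prng = np.random.RandomState(0)
first = simulate(3, prng)

# Reseeding the shared global instance has no effect on our own instance.
np.random.seed(12345)
second = simulate(3, np.random.RandomState(0))
```

Because each caller owns its generator, `first` and `second` are the same
stream regardless of what other code does to the global state.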

There is some low-level C work that needs to be done to allow the
non-uniform distributions to be shared between implementations of the
core uniform PRNG, but that's the same no matter how you organize the
upper layer.
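
As a rough illustration of that layering, written in pure Python rather
than the C it would actually require: the non-uniform distributions live
in a base class and call only a single uniform primitive, which each core
supplies. All class names here are hypothetical, and both cores use
Python's `random.Random` as a stand-in raw integer source; neither is an
actual MT19937-64 implementation.

```python
import math
import random


class BaseRandomState:
    """Hypothetical base: distributions depend only on random_sample()."""

    def random_sample(self):
        raise NotImplementedError

    def exponential(self, scale=1.0):
        # Inverse-CDF method; 1 - u lies in (0, 1], so the log is defined.
        return -scale * math.log(1.0 - self.random_sample())


class TwoWordState(BaseRandomState):
    """Core that builds a double from two 32-bit draws."""

    def __init__(self, seed):
        self._raw = random.Random(seed)  # stand-in raw source (assumption)

    def random_sample(self):
        a = self._raw.getrandbits(32) >> 5
        b = self._raw.getrandbits(32) >> 6
        return (a * 67108864.0 + b) / 9007199254740992.0


class OneWordState(BaseRandomState):
    """Core that builds a double from one 64-bit draw."""

    def __init__(self, seed):
        self._raw = random.Random(seed)  # stand-in raw source (assumption)

    def random_sample(self):
        return (self._raw.getrandbits(64) >> 11) / 9007199254740992.0


x32 = TwoWordState(0).exponential(scale=2.0)
x64 = OneWordState(0).exponential(scale=2.0)
```

The `exponential` method is written once and works with either core, which
is the property the C-level refactoring would need to provide.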

Robert Kern
