[Numpy-discussion] skip samples in random number generator

Robert Kern robert.kern at gmail.com
Thu Oct 2 17:24:10 EDT 2014


On Thu, Oct 2, 2014 at 9:52 PM, Sturla Molden <sturla.molden at gmail.com> wrote:
> Robert Kern <robert.kern at gmail.com> wrote:
>
>> No one needs small jumps of arbitrary size. The real use case for
>> jumping is to make N parallel streams that won't overlap. You pick a
>> number, let's call it `jump_steps`, much larger than any single run of
>> your system could possibly consume (i.e. the number of core PRNG
>> variates pulled is << `jump_steps`). Then you can initialize N
>> parallel streams by initializing RandomState once with a seed, storing
>> that RandomState, then jumping ahead by `jump_steps`, storing *that*
>> RandomState, by `2*jump_steps`, etc. to get N RandomState streams that
>> will not overlap. Give those to your separate processes and let them
>> run.
>>
>> So the alternative may actually be to just generate and distribute
>> *one* set of these jump coefficients for a jump size that is really
>> big but still leaves you enough space for a really large number of streams
>> (fortunately, 2**19937-1 is a really big number).
>
> DCMT might be preferred in this case. It works the same, except you have N
> "random state" streams with characteristic polynomials that are distinct
> and relatively prime to each other. Thus each of the N processes will get
> an independent stream of random numbers, without any chance of overlap.
>
> http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/DC/dc.html

Yes, but that would require rewriting much of numpy.random to allow
replacing the core generator. Jumping ahead, by contrast, would work
out of the box because it just manipulates the state of the current
core generator.
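
For concreteness, here is a rough sketch of the stream setup described
above. It is only an illustration: `make_streams` is a hypothetical
helper, not anything in numpy.random, and the "jump" is a naive stand-in
that draws and discards `jump_steps` doubles. A real jump-ahead would
reach the same state directly, without generating the intermediate
variates, which is what makes astronomically large jump sizes feasible.

```python
import numpy as np


def make_streams(seed, n_streams, jump_steps):
    """Return n_streams RandomState objects spaced jump_steps draws apart.

    Hypothetical helper for illustration. The jump is done naively by
    drawing and discarding jump_steps doubles from a master generator.
    """
    master = np.random.RandomState(seed)
    streams = []
    for _ in range(n_streams):
        rs = np.random.RandomState()
        # Copy the master's current position into an independent object.
        rs.set_state(master.get_state())
        streams.append(rs)
        # Naive "jump": advance the master by jump_steps doubles.
        master.random_sample(jump_steps)
    return streams


# Each stream starts where the previous one would end, provided no
# stream ever draws more than jump_steps doubles.
streams = make_streams(seed=12345, n_streams=3, jump_steps=1000)
```

Each process would then be handed one element of `streams`. The
no-overlap guarantee only holds as long as every process consumes fewer
than `jump_steps` draws, which is why the jump size has to be chosen far
larger than any run could use.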

-- 
Robert Kern


