numpy performance and random numbers
Steven D'Aprano
steve at REMOVE-THIS-cybersource.com.au
Sat Dec 19 19:55:48 EST 2009
On Sat, 19 Dec 2009 13:58:37 -0800, sturlamolden wrote:
> On 19 Des, 21:26, Lie Ryan <lie.1... at gmail.com> wrote:
>
>> you can just start two PRNG at two distinct states
>
> No. You have to know for certain that the outputs don't overlap.
"For certain"? Why?
Presumably you never do a Monte Carlo simulation once, you always repeat
the simulation. Does it matter if (say) one trial in fifty is lower-
quality than the rest because the outputs happened to overlap? What if
it's one trial in fifty thousand? So long as the probability of overlap
is sufficiently small, you might not even care about doing repeated
simulations.
> If you pick two random states (using any PRNG), you need error- checking
> that states are always unique, i.e. that each PRNG never reaches the
> starting state of the other(s). If I had to do this, I would e.g. hash
> the state to a 32 bit integer. You can deal with errors by jumping to a
> different state or simply aborting the simulation. The problem is that
> the book-keeping will be costly compared to the work involved in
> generating a single pseudorandom deviate. So while we can construct a
> parallel PRNG this way, it will likely run slower than one unique PRNG.
Since truly random sequences sometimes produce repeating sequences, you
should be able to tolerate short periods of overlap. I'd consider the
following strategy: for each of your parallel PRNGs other than the first,
periodically jump to another state unconditionally. The periods should be
relatively prime to each other, e.g. the second PRNG jumps-ahead after
(say) 51 calls, the third after 37 calls, etc. (the exact periods
shouldn't be important). Use a second, cheap, PRNG to specify how far to
jump ahead. The overhead should be quite small: a simple counter and test
for each PRNG, plus a small number of calls to a cheap PRNG, and
statistically you should expect no significant overlap.
Is this workable?
--
Steven
More information about the Python-list
mailing list