numpy performance and random numbers
Peter Pearson
ppearson at nowhere.invalid
Sun Dec 20 20:41:17 EST 2009
On Sun, 20 Dec 2009 07:26:03 +1100, Lie Ryan <lie.1296 at gmail.com> wrote:
> On 12/20/2009 4:02 AM, Carl Johan Rehn wrote:
>
>>>>> Parallel PRNGs are an unsolved problem in computer science.
>>
>> Thanks again for sharing your knowledge. I had no idea. This means
>> that if I want to speed up my application I have to go for the fastest
>> random generator and focus on other parts of my code that can be
>> vectorized.
>
> If you don't care about "repeatability" (which is already extremely
> difficult in parallel processing even without random number generators),
> you can just start two PRNG at two distinct states (and probably from
> two different algorithms) and they will each spews out two independent
> streams of random numbers. What was "unsolved" was the "pseudo-" part of
> the random number generation, which guarantee perfect replayability in
> all conditions.
Why not use a good cipher, such as AES, to generate a pseudorandom
bit stream by encrypting successive integers? If you use a different
key for each instance, you'll have exactly the independence you want.
And if you can detect any statistical anomaly in the output, you
automatically have the most exciting paper to be published in the next
issue of the Journal of Cryptology.
Minor caveats:
1. This might be slower than another approach, but maybe not:
AES is much faster than the ciphers of the old days.
2. Since AES(key,i) != AES(key,j) if i != j, there will be a
dearth of duplicates, which will become statistically
detectable around the time i has been incremented
2**64 times. There are many reasons why this might not
bother you (one: you don't plan to use so many values;
two: you might use the 128-bit AES output in pieces, rather
than as a single value, in which case duplicates will
appear among the pieces at the right rate), but if it *does*
bother you, it can be fixed by using i^AES(key,i) instead
of just AES(key,i).
--
To email me, substitute nowhere->spamcop, invalid->net.
More information about the Python-list
mailing list