SHA-based subclass for random module
dmq at gain.com
Mon Mar 22 21:18:31 CET 2004
On Mon, 22 Mar 2004 13:41:46 -0500, "Tim Peters" <tim.one at comcast.net>
>> I'm still curious as to the answer to my very simple and basic
>It is basic, but there's nothing simple about it.
>> Are there any examples that show a *non-subjective* difference in a
>> physically meaningful result between one PRNG and another?
>Yes, but different RNGs have different weaknesses, and whether the
>weaknesses of a specific RNG *matter* to you depend on detailed analysis of
>the specific app you have in mind.
>The primary weakness of linear congruential generators (still the most
>common kind) is dramatically illustrated all over the web; e.g., look at the
>3D plot at:
>If you need to generate "random" points in 3D space, does it matter that the
>particular LCG plotted there systematically generates points that fall on
>one of 15 planes, no matter how many points you generate? There's no answer
>to that short of analyzing the specifics of what you're trying to accomplish
>by generating those points. For example, if your app relies on the triples
>generated being approximately equally distributed throughout a 3D cube, that
>particular LCG is an obvious disaster for that particular application. But
>if your app only cares that the projection of the points onto the x axis is
>approximately equally distributed, probably no problem, despite that it's
>obvious "by eyeball" that there's nothing even remotely "random" about their
>distribution in 3D space.
>You're not going to get an easy answer. Well, here's one: the Mersenne
>Twister in Python 2.3 is probably the best and most widely tested fast PRNG
>in existence now, and has provably good equidistribution properties in high
>dimensions (a plot like the one above would have no visible patterns if the
>Twister had been used instead).
This is the first good answer I have heard, and it has changed my
thinking on this question. I understand now that the problem is not
just something that might be found in the tenth decimal place. The
clustering of random points into very thin planes could radically
affect a simulation depending on the distances between those points.
The problem I am working on (statistical simulation of electronic
circuits) involves a large number of 1D distributions, but with
carefully controlled correlations between these distributions. We are
probably just lucky that the PRNG we are using has not caused a
noticable problem, but seeing just one clear example like the above
has made me decide to be much more cautious in selecting a PRNG.
Thanks for the enlightenment.
More information about the Python-list