SHA-based subclass for Random module
Josiah Carlson
jcarlson at nospam.uci.edu
Mon Mar 15 19:32:04 EST 2004
> This is intended to be less predicable/have fewer correlations than
> the default Mersenne Twister or Wichmann-Hill generators. Comments
> are appreciated.
FYI: Mersenne Twister is a pretty spiffy pseudo random number
generator...it is also /really/ fast.
From what I can gather, you are using the SHA hash to behave like a
random-number generator through updating the hash with a constant string
and the previous state.
On the one hand, the SHA hash is not known to be generally invertable
for any of the uniquely-mapped 160 bit input strings, or really for any
non-trivial initial states or input strings. It could very well be that
it is not invertable for any string or state, but I've not done suitable
research to know for sure. Quite nice for an invertable perspective and
suggesting that short cycles in state are rare, if even possible.
On the other hand, you could just as easily use the MD5 hash. The only
thing you get from SHA is perhaps more bits of state. I don't know
which one is faster though.
Now, a good question to be asked is, "What the hell do the SHA and MD5
hashes do?"
Really, they take input character-by-character, perterb some current
state with some non-trivially flowing functions with feedback, and
produce a seemingly-random output that /should/ be uniform over the
output space. Whether or not they produce output that /is/ uniform over
the output space is probably the result of some involved paper that I've
not read.
One thing to remember is that the Mersenne Twister is known to cycle in
state only after 2**19937-1 calls. Since you haven't done any
measurable analysis of your method, or how often it would cycle in state
during the repeated re-setting of the state to the hash of some
relatively short input-string, that is only pseudo-random near the end
(the previous hash), I expect that yours may not fare much (if any) better.
One could exploit the features of both, and perhaps end up with a random
number generator that cycles less often (if sha cycles in S iterations,
and the mersenne twister cycles in M iterations, a new algorithm that
cycles in SM iterations is possible, if S and M are coprime), but
without actually analyzing their interaction, the behavior of such a
random number generator is really only open for speculation.
- Josiah
More information about the Python-list
mailing list