SHA-based subclass for Random module

Josiah Carlson jcarlson at nospam.uci.edu
Mon Mar 15 19:32:04 EST 2004


> This is intended to be less predicable/have fewer correlations than
> the default Mersenne Twister or Wichmann-Hill generators.  Comments
> are appreciated.

FYI: Mersenne Twister is a pretty spiffy pseudo random number 
generator...it is also /really/ fast.

 From what I can gather, you are using the SHA hash to behave like a 
random-number generator through updating the hash with a constant string 
and the previous state.

On the one hand, the SHA hash is not known to be generally invertable 
for any of the uniquely-mapped 160 bit input strings, or really for any 
non-trivial initial states or input strings.  It could very well be that 
it is not invertable for any string or state, but I've not done suitable 
research to know for sure.  Quite nice for an invertable perspective and 
suggesting that short cycles in state are rare, if even possible.

On the other hand, you could just as easily use the MD5 hash.  The only 
thing you get from SHA is perhaps more bits of state.  I don't know 
which one is faster though.


Now, a good question to be asked is, "What the hell do the SHA and MD5 
hashes do?"

Really, they take input character-by-character, perterb some current 
state with some non-trivially flowing functions with feedback, and 
produce a seemingly-random output that /should/ be uniform over the 
output space.  Whether or not they produce output that /is/ uniform over 
the output space is probably the result of some involved paper that I've 
not read.


One thing to remember is that the Mersenne Twister is known to cycle in 
state only after 2**19937-1 calls.  Since you haven't done any 
measurable analysis of your method, or how often it would cycle in state 
during the repeated re-setting of the state to the hash of some 
relatively short input-string, that is only pseudo-random near the end 
(the previous hash), I expect that yours may not fare much (if any) better.


One could exploit the features of both, and perhaps end up with a random 
number generator that cycles less often (if sha cycles in S iterations, 
and the mersenne twister cycles in M iterations, a new algorithm that 
cycles in SM iterations is possible, if S and M are coprime), but 
without actually analyzing their interaction, the behavior of such a 
random number generator is really only open for speculation.

  - Josiah



More information about the Python-list mailing list