[Python-Dev] random number generator state

Raymond Hettinger python at rcn.com
Sat Aug 15 23:58:10 CEST 2009


[Scott David Daniels]
>I find I have a need in randomized testing for a shorter version
> of getstate, even if it _is_ slower to restore.  When running
> exhaustive tests, a failure report should show the start state
> of the generator.  Unfortunately, our current state includes a
> 625-element array.  I want a state that can be read off a report
> and typed in to reproduce the state.  Something a bit like the
> initial seed, a count of cycle calls, and a few other things.

Sounds like you could easily wrap the generator to get this.
It would slow you down but would give the information you want.

I think it would be a mistake to complexify the API to accomodate
short states -- I'm not even sure than they are generally useful
(recording my initial seed and how many cycles I've run through
is only helpful for sequences short enough that I'm willing to rerun
them).

I'm curious what your use case is.  Why not just record the 
the sequence as generated -- I don't see any analytic value to
just knowing the initial seed and cycle count.  

Ability to print out a short state implies that you are using only a
small subset of possible states (i.e. the ones you can get to with
a short seed).  A short state print out isn't even possible if you actually
have a random initial state (every state having an equal chance of
being the starting point).



>  In trying to
> get this to work, I found what might be a bug:
> code says
>   mt[0] = 0x80000000UL; /* MSB is 1; assuring non-zero initial array */
> but probably should be:
>   mt[0] |= 0x80000000UL; /* MSB is 1; assuring non-zero initial array */

Please file a bug report for this and assign to me.  I put in the existing
MT code and took it directly from the author's published (and widely
tested code).  Also, our tests for MT exactly reproduce their published test
sequence.  But, if there is an error, I would be happy to fix it.



> In checking into that issue, I went to the original Mersenne-Twister
> code, and I see the original authors are pursuing a newer generator,
> dSFMT.

The MT itself has the advantage of having been widely exercised and
tested.  The newer generator may have more states but has not been
as extensively tested.


> I now have a dilemma.  Should I continue the work on the original M-T
> code (which is now seeming problematic for compatibility) or simply make
> a new generator with similar calls using dSFMT and put the new feature
> in that where there is no compatibility problem.  Which would be more
> useful for the Python community?

It's not hard to subclass Random and add different generators.  Why not
publish some code on ASPN and see how it gets received.  I've put a
recipe there for a long period generator, http://code.activestate.com/recipes/576707/ ,
but there doesn't seem to have been any real interest in generators with
longer periods than MT. 


Raymond

 


More information about the Python-Dev mailing list