[Scott David Daniels]
I find I have a need in randomized testing for a shorter version of getstate, even if it _is_ slower to restore. When running exhaustive tests, a failure report should show the start state of the generator. Unfortunately, our current state includes a 625-element array. I want a state that can be read off a report and typed in to reproduce the state. Something a bit like the initial seed, a count of cycle calls, and a few other things.
Sounds like you could easily wrap the generator to get this. It would slow you down but would give the information you want.
I think it would be a mistake to complexify the API to accomodate short states -- I'm not even sure than they are generally useful (recording my initial seed and how many cycles I've run through is only helpful for sequences short enough that I'm willing to rerun them).
I'm curious what your use case is. Why not just record the the sequence as generated -- I don't see any analytic value to just knowing the initial seed and cycle count.
Ability to print out a short state implies that you are using only a small subset of possible states (i.e. the ones you can get to with a short seed). A short state print out isn't even possible if you actually have a random initial state (every state having an equal chance of being the starting point).
In trying to get this to work, I found what might be a bug: code says mt = 0x80000000UL; /* MSB is 1; assuring non-zero initial array */ but probably should be: mt |= 0x80000000UL; /* MSB is 1; assuring non-zero initial array */
Please file a bug report for this and assign to me. I put in the existing MT code and took it directly from the author's published (and widely tested code). Also, our tests for MT exactly reproduce their published test sequence. But, if there is an error, I would be happy to fix it.
In checking into that issue, I went to the original Mersenne-Twister code, and I see the original authors are pursuing a newer generator, dSFMT.
The MT itself has the advantage of having been widely exercised and tested. The newer generator may have more states but has not been as extensively tested.
I now have a dilemma. Should I continue the work on the original M-T code (which is now seeming problematic for compatibility) or simply make a new generator with similar calls using dSFMT and put the new feature in that where there is no compatibility problem. Which would be more useful for the Python community?
It's not hard to subclass Random and add different generators. Why not publish some code on ASPN and see how it gets received. I've put a recipe there for a long period generator, http://code.activestate.com/recipes/576707/ , but there doesn't seem to have been any real interest in generators with longer periods than MT.