[Python-Dev] Mersenne Twister

Tim Peters tim.one@comcast.net
Thu, 29 Aug 2002 15:45:41 -0400


[Raymond Hettinger]
> I'm sketching out an approach to the Mersenne Twister and
> wanted to make sure it is in line with what you want.
>
> -- Write it in pure python as a drop-in replacement for Wichman-Hill.

I'd rather the Twister were in C -- it's low-level bit-fiddling, and Python
isn't well-suited to high-throughput bit fiddling.  IIRC, Ivan Frohne
eventually got his pure-Python Twister implementation (see earlier msg) to
within 15% of whrandom's speed, but it *could* be 10x faster w/o even
trying.

> -- Add a version number argument to Random() which defaults to two.
>    If set to one, use the old generator so that it is possible
>    to recreate sequences from earlier versions of Python.  Note, the code
>    is much shorter if we drop this requirement.  On the plus side, it
gives
>    more than backwards compatability,

Backwards compatability is essential, at least in the sense that there's
*some* way to exactly reproduce results from earlier Python releases.  The
seed function in whrandom.py was very weak (for reasons explained in
random.py), and I did make that incompatible when improving it, but also
introduced whseed so that there was *some* way to reproduce older results
bit-for-bit.  I've gotten exactly one complaint about the incompatibility
since then.

Note that new generators are intended to be introduced via sublassing of
Random:

    # Specific to Wichmann-Hill generator.  Subclasses wishing to use a
    # different core generator should override the seed(), random(),
    # getstate(), setstate() and jumpahead() methods.

I don't believe the jumpahead() method can be usefully implemented for the
Twister.

>    it gives the ability to re-run a simulation with another generator to
>    assure that the result isn't a fluke related to a generator design
flaw.

That's very important in real life.  Ivan Frohne's design (and alternative
generators) should be considered here.

> -- Document David Abrahams's link to
>     http://www.boost.org/boost/random/mersenne_twister.hpp as the
>     reference implementation and
>     http://www.math.keio.ac.jp/matumoto/emt.html as a place for
>     more information.  Key-off of the MT19337 version as the most
>     recent stable evolution.

I'd simply use the authors' source code.  You don't get bonus points for
ripping out layers of C++ templates someone else gilded the lily under
<wink>.

> -- Move the existing in-module test-suite into a unittest.

Easier said than done.  The test suite doesn't do anything except print
results -- it has no intelligence about whether the results are good or bad.
It was expected that a bona fide expert would stare at the output.

>    Add a new, separate unittest suite with tests specific to MT
(recreating
>    a few sequences produced by reference implementations)

I doubt you need a distinct test file for that.

>    and with a battery of Knuth style tests.

They're far behind the current state of the art, and, as Knuth mentions in
an exercise, it's "term project" level effort to implement them even so.

>    The validation results are at:
>    http://www.math.keio.ac.jp/matumoto/ver991029.html
>
> -- When we're done, have a python link put on the Mersenne Twister
>     Home Page (the second link above).

Yes!  Cool idea.

> -- Write, test and document the generator first.

As opposed to what, eating dinner first <wink>?

>     Afterwards, explore
>     techniques for creating multiple independent streams:
>     http://www.math.h.kyoto-u.ac.jp/~matumoto/RAND/DC/dc.html

Agreed that should be delayed.