[Numpy-discussion] Purpose for bit-wise and'ing the initial mersenne twister key?

Robert Kern robert.kern at gmail.com
Thu Feb 12 15:32:02 EST 2009


On Thu, Feb 12, 2009 at 14:17, Michael S. Gilbert
<michael.s.gilbert at gmail.com> wrote:
> On Thu, 12 Feb 2009 13:18:26 -0600 Robert Kern wrote:
>> > I did some testing with this 64-bit implementation (mt19937-64). I've
>> > found that it is actually slower than the 32-bit reference (mt19937ar)
>> > on 64-bit systems (2.15s vs 2.25s to generate 100000000 ints). This is
>> > likely because it generates 64-bit long long ints instead of 32-bit
>> > long ints. However, it should be possible to split each 64-bit int
>> > into two 32-bit ints, in which case generation would appear to be
>> > almost twice as fast.
>>
>> Why do you think that?
>
> You could also think of it the other way (in terms of generating 64-bit
> ints). Instead of generating two 32-bit ints and concatenating them
> for a 64-bit int, you can just directly generate the 64-bit int. Since
> the 64-bit int requires only slightly more time to generate than either
> of the 32-bit ints individually, an almost 2x speedup is achieved.

<shrug> I'll believe it when I see it.
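
For concreteness, the splitting being proposed amounts to a shift and a
mask. A minimal Python sketch follows; the helper names split_u64 and
join_u32 are made up for illustration and are not part of numpy or of
mt19937-64. It shows only the bit manipulation and says nothing about
whether the claimed speedup materializes:

```python
MASK32 = 0xFFFFFFFF

def split_u64(x):
    """Split one 64-bit unsigned integer into (high, low) 32-bit halves."""
    return (x >> 32) & MASK32, x & MASK32

def join_u32(hi, lo):
    """Concatenate two 32-bit unsigned integers into one 64-bit integer."""
    return ((hi & MASK32) << 32) | (lo & MASK32)

hi, lo = split_u64(0x0123456789ABCDEF)
# hi holds the upper 32 bits and lo the lower 32 bits of the input.
```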

>> > One other consideration to keep in mind is that the 64-bit
>> > version is not stream-compatible with the 32-bit implementation (you
>> > will get different sequences for the same input seed).
>> >
>> > Would it be worth it to implement this in numpy in order to get an
>> > almost 2x speedup on 64-bit machines?
>>
>> The incompatibility is a showstopper to replacing the PRNG on any platform.
>
> Why is stream-compatibility such a stringent requirement?  It seems
> like this constraint severely limits your ability to adopt new/better
> technologies and reduces your flexibility to make changes to your code
> as needed.  What end-use applications require stream-compatibility?  The
> only thing I can think of is verification/regression testing for numpy,
> but those test cases could be updated to account for the break in
> compatibility (specifically, by using the reference implementation's
> expected output).  Wouldn't adequate documentation of the change in
> behavior be sufficient?

Some people don't think so. People have asked for more stringent
compatibility than we can already provide (i.e. replicability even in
the face of bug fixes). People use these as inputs to their scientific
simulations. I'm not going to intentionally make their lives harder
than that.
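
To illustrate what stream compatibility buys: with a fixed algorithm,
the same seed reproduces the same sequence, so a simulation can be rerun
exactly. A small sketch using Python's standard-library random module
(also a Mersenne Twister) rather than numpy; the helper name draw and
the seed values are arbitrary choices for the example:

```python
import random

def draw(seed, n=5):
    """Return the first n 32-bit outputs for a given seed."""
    rng = random.Random(seed)
    return [rng.getrandbits(32) for _ in range(n)]

# Same seed, same algorithm: identical streams, so results can be replicated.
assert draw(12345) == draw(12345)
# A different seed gives a different stream.
assert draw(12345) != draw(54321)
```

Replacing the underlying generator would change what draw(12345) returns,
which is exactly the break that people relying on recorded seeds would see.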

Bruce Southey was working on exposing the innards a bit so that you
could make use of a different core PRNG while reusing the
numpy-specific stuff in RandomState. That would be the way to
adopt different technologies.
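
A rough sketch of that kind of separation, in plain Python. This is not
Bruce's design or numpy's API; the class names MT32Core, LCGCore and
Distributions, and the next_u32 method, are hypothetical. The idea is
only that the distribution code depends on a small core interface, so a
different core can be opted into without changing the default stream:

```python
import random

class MT32Core:
    """Hypothetical core: wraps the stdlib Mersenne Twister, 32 bits per call."""
    def __init__(self, seed):
        self._rng = random.Random(seed)

    def next_u32(self):
        return self._rng.getrandbits(32)

class LCGCore:
    """Hypothetical alternative core: a toy 64-bit linear congruential
    generator (Knuth's MMIX constants), returning the top 32 bits."""
    def __init__(self, seed):
        self._state = seed % 2**64

    def next_u32(self):
        self._state = (self._state * 6364136223846793005
                       + 1442695040888963407) % 2**64
        return self._state >> 32

class Distributions:
    """Distribution layer that only needs a core with next_u32()."""
    def __init__(self, core):
        self._core = core

    def random_sample(self):
        # Scale 32 random bits to a float in [0, 1).
        return self._core.next_u32() / 2.0**32

    def uniform(self, low, high):
        return low + (high - low) * self.random_sample()

default = Distributions(MT32Core(42))   # existing stream stays as it is
other = Distributions(LCGCore(42))      # different core, same distribution code
```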

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco


