[Numpy-discussion] [Scikit-learn-general] random number generator, entropy and pickling

Robert Kern robert.kern at gmail.com
Mon Apr 25 14:23:12 EDT 2011


On Mon, Apr 25, 2011 at 13:15, Gael Varoquaux
<gael.varoquaux at normalesup.org> wrote:
> On Mon, Apr 25, 2011 at 11:05:05AM -0700, T J wrote:
>> If code A relies on code B (e.g., some numpy function) and code B
>> changes, then the stream of random numbers will no longer be the same.
>>  The point here is that the user wrote code A but depended on code B,
>> and even though code A was unchanged, their random numbers were not
>> the same.
>
> Yes, that's exactly why we want the different objects to be able to
> receive their own PRNG.

But seriously, they are running A+B, the combination of A and B. If
A+B changes to A+B', then the results may be different. That's to be
expected.
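A minimal sketch of the failure mode (hypothetical functions; the only
point is that A and B share numpy's global PRNG):

    import numpy as np

    def b_old(x):
        # old B: draws once from the shared global PRNG
        return x + np.random.normal()

    def b_new(x):
        # new B': consumes one extra draw internally
        np.random.uniform()
        return x + np.random.normal()

    def a(b):
        # code A: unchanged; seeds and then draws from the global PRNG
        np.random.seed(0)
        b(1.0)
        return np.random.random()

    print(a(b_old))   # one value
    print(a(b_new))   # a different value, though A never changed

A's own draw shifts because B' consumed an extra value from the shared
stream, even though A's source is byte-for-byte identical.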

>> The situation would improve if scikits.learn used its own global
>> RandomState instance.  Then code A would at least give the same stream
>> of random numbers for a fixed version of scikits.learn.  It should be
>> made very clear, though, that the random stream cannot be expected to
>> be the same across versions.
>
> The use case that we are trying to cater for, with the global PRNG, is
> Mister Joe Average, who is used to seeding the numpy PRNG to control
> what is going on.

Honestly, they really shouldn't be, except as a workaround for
poorly-written functions that don't let you pass in your own PRNG.
Someone snuck in the module-level alias to the global PRNG's seed()
method when I wasn't paying attention. :-)
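The pattern to encourage instead is roughly this (a sketch, not
scikits.learn's actual API; `jitter` and `_rng` are made-up names):

    import numpy as np

    # Library-level PRNG, as T J suggests: isolates the library from
    # np.random's global state.
    _rng = np.random.RandomState()

    def jitter(x, random_state=None):
        # Preferred: the caller passes in their own RandomState.
        # Fallback: the library's private PRNG, never np.random.
        rng = random_state if random_state is not None else _rng
        return np.asarray(x) + rng.normal(size=len(x))

    # The caller controls reproducibility without touching
    # np.random.seed():
    prng = np.random.RandomState(42)
    print(jitter([1.0, 2.0, 3.0], random_state=prng))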

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco


