[Numpy-discussion] stability of numpy.random.RandomState API?

Robert Kern robert.kern at gmail.com
Thu Nov 6 16:55:20 EST 2008


On Thu, Nov 6, 2008 at 15:12, Barry Wark <barrywark at gmail.com> wrote:
> On Thu, Nov 6, 2008 at 12:09 PM, Robert Kern <robert.kern at gmail.com> wrote:
>> On Thu, Nov 6, 2008 at 14:05, Barry Wark <barrywark at gmail.com> wrote:
>>> I'm just about to embark on a long-term research project and was
>>> planning to use numpy.random to generate stimuli for our experiments.
>>> We plan to store only the parameters and RandomState seed for each
>>> stimulus and I'm concerned about stability of the API in the long
>>> term: will the parameters and random seed we store now work with
>>> future versions of numpy.random?
>>
>> It should. But just in case, make sure you explicitly instantiate
>> RandomState objects instead of using the functions in numpy.random.
>> That way, should we need to fix some bug that might change the
>> results, you can always pull out the current mtrand code and use it
>> independently.
>
> That is our working plan, as well as to record the numpy.__version__
> which was used to generate the original stimulus. Thanks for the
> confirmation.
>
> On a side note, this seems like a potentially big issue for many
> scientific users. Perhaps making a policy of keeping incompatible
> revisions to RandomState noted in its documentation (if they ever
> come up) would be useful. Even better, a module function or class
> method that returns an instance of RandomState as it was at a
> particular numpy version:
>
> r = numpy.random.RandomState.from_version(my_numpy_version, seed=None)
>
> Hmm. Sounds like a bit of work. I'll give it a go, if you think this
> is a valuable approach.
>
>>
>>> I think I recall that there was a
>>> change in the random seed format some time around numpy 1.0.
>>
>> I don't think I changed it after 1.0. Before 1.0, we explicitly warned
>> people about API instability.
>
> I believe you. We've been developing this app since before numpy 1.0,
> so I'm sure the issue cropped up from data generated pre-1.0.

Okay. Actually, now that I think about it, there have been changes
that would affect results using the nonuniform distributions. These
should only have arisen from fixing bugs (i.e. the previous results
were wrong, not just different). Do you have any thoughts on how you
would want us to handle that case?
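
For reference, the approach discussed above (an explicitly instantiated
RandomState, plus a stored seed and numpy.__version__) might look roughly
like the sketch below. The function name make_stimulus, the record layout,
and the seed value are hypothetical and chosen only for illustration. The
sketch uses uniform draws and says nothing about how results from the
nonuniform distributions may differ across versions.

```python
import numpy as np


def make_stimulus(seed, n_samples):
    """Regenerate a stimulus from its stored seed (hypothetical helper)."""
    # A private RandomState keeps this stream independent of the global
    # generator behind the module-level numpy.random functions.
    rng = np.random.RandomState(seed)
    return rng.uniform(0.0, 1.0, size=n_samples)


# Store only the parameters, the seed, and the numpy version used.
record = {
    "seed": 12345,
    "n_samples": 8,
    "numpy_version": np.__version__,
}

a = make_stimulus(record["seed"], record["n_samples"])
b = make_stimulus(record["seed"], record["n_samples"])

# Within a single numpy version, the same seed reproduces the same draws.
assert np.array_equal(a, b)
```

Recording numpy_version alongside the seed makes it possible to tell later
which mtrand code produced a given stimulus.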

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco


