[Numpy-discussion] Proposal: numpy.random.random_seed

Wed May 18 13:20:57 EDT 2016

On Wed, May 18, 2016 at 12:01 PM, Robert Kern <robert.kern at gmail.com> wrote:

> On Wed, May 18, 2016 at 4:50 PM, Chris Barker <chris.barker at noaa.gov>
> wrote:
> >>
> >> > ...anyway, the real reason I'm a bit grumpy is because there are solid
> >> > engineering reasons why users *want* this API,
> >
> > Honestly, I am lost in the math -- but like any good engineer, I want to
> accomplish something anyway :-) I trust you guys to get this right -- or at
> least document what's "wrong" with it.
> >
> > But, if I'm reading the use case that started all this correctly, it
> closely matches my use-case. That is, I have a complex model with multiple
> independent "random" processes. And we want to be able to reproduce
> simulations EXACTLY -- our users get confused when the results are
> "different" even if in a statistically insignificant way.
> >
> > At the moment we are using one RNG, with one seed for everything. So we
> get reproducible results, but if one thing is changed, then the entire
> simulation is different -- which is OK, but it would be nicer to have each
> process using its own RNG stream with its own seed. However, it matters
> not one whit if those seeds are independent -- the processes are different,
> you'd never notice if they were using the same PRN stream -- because they
> are used differently. So a "fairly low probability of a clash" would be
> totally fine.
>
> Well, the main question is: do you need to be able to spawn dependent
> streams at arbitrary points to an arbitrary depth without coordination
> between processes? The necessity for multiple independent streams per se is
> not contentious.
>

I'm in the same position as Chris: I didn't try to figure out the details
of what you are talking about.

However, if functions get into numpy that help users follow a best
practice, even one that is not bulletproof, then that's still better than
home-made approaches.
If it gets in soon, then we can use it in a few years (given dependency
lag). At that point there should be more distributed, nested,
simulation-based algorithms where we don't know in advance how far we have
to go to get reliable numbers or convergence.

(But I don't see anything like that right now.)
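
To make the use case concrete, here is a minimal sketch of the kind of
home-made approach in question: one master seed, and a separate
RandomState per named process. The helper name make_streams and the
process names "wind" and "waves" are made up for illustration, and the
derived seeds are only unlikely to clash, not guaranteed independent.

```python
import numpy as np


def make_streams(master_seed, names):
    """Hypothetical helper: one RandomState per named process.

    Seeds are drawn from a master RandomState, so they are reproducible
    but NOT guaranteed to be distinct or statistically independent.
    """
    master = np.random.RandomState(master_seed)
    # One 31-bit seed per process, drawn from the master stream.
    seeds = master.randint(0, 2**31 - 1, size=len(names))
    return {name: np.random.RandomState(int(seed))
            for name, seed in zip(names, seeds)}


# Two runs with the same master seed.
run_a = make_streams(12345, ["wind", "waves"])
run_b = make_streams(12345, ["wind", "waves"])

# Change how much randomness one process consumes in run_a only.
run_a["wind"].random_sample(10)

# The other process still sees exactly the same numbers in both runs.
waves_a = run_a["waves"].random_sample(5)
waves_b = run_b["waves"].random_sample(5)
assert np.array_equal(waves_a, waves_b)
```

With a single shared RNG, the extra draws for "wind" would have shifted
every number that "waves" sees afterwards; with one stream per process
they do not.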

Josef

>
> --
> Robert Kern
>