[Numpy-discussion] overhauling numpy.random and randomgen

Stephan Hoyer shoyer at gmail.com
Fri Apr 19 13:21:18 EDT 2019


On Fri, Apr 19, 2019 at 5:16 AM Neal Becker <ndbecker2 at gmail.com> wrote:

> The boost_random c++ library uses the terms 'generators' and
> 'distributions'.  Distributions are applied to generators.
>

"distributions" is a little confusing in the context of
scipy.stats.distributions, where a distribution corresponds to a particular
probability distribution.


> On Fri, Apr 19, 2019 at 7:54 AM Kevin Sheppard
> <kevin.k.sheppard at gmail.com> wrote:
> >
> > > Rather than "base RNG", what about calling these classes a "random
> > > source" or "random stream"? In particular, I would suggest defining
> > > two Python classes:
> > > - np.random.Generator as a less redundant name for what is currently
> > >   called RandomGenerator
> > > - np.random.Source or np.random.Stream as an abstract base class for
> > >   what are currently called "base RNGs"
> >
> > Naming is definitely hard.  Simple RNGs are currently called basic RNGs,
> > a name inspired by mkl-random. 'source' sounds OK to me, but sort of
> > hides the fact that these are the actual pseudo-RNGs. `stream` has a
> > technical meaning (a single PRNG may produce multiple independent
> > streams) and IMO should be avoided since this might lead to confusion.
> > Perhaps source_rng (or in docs Source RNG)?
> >
> > RandomGenerator is actually RandomTransformer, but I didn't like the
> > latter name.
> >
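
To make the proposed split concrete, here is a minimal pure-Python sketch.
The class names Source and Generator follow the suggestion above; LCGSource,
next_uint32 and the toy arithmetic are assumptions made up for illustration
and are not part of numpy or randomgen.

```python
import abc


class Source(abc.ABC):
    """Hypothetical abstract base class for what are currently "base RNGs"."""

    @abc.abstractmethod
    def next_uint32(self) -> int:
        """Return the next raw 32-bit unsigned integer."""


class LCGSource(Source):
    """Toy linear congruential source, for illustration only."""

    def __init__(self, seed: int) -> None:
        self._state = seed % 2**32

    def next_uint32(self) -> int:
        # Simple LCG step; not suitable for real use.
        self._state = (1664525 * self._state + 1013904223) % 2**32
        return self._state


class Generator:
    """Hypothetical user-facing class that turns raw bits into variates."""

    def __init__(self, source: Source) -> None:
        self.source = source

    def random(self) -> float:
        # Scale a raw 32-bit integer into the half-open interval [0, 1).
        return self.source.next_uint32() / 2**32


gen = Generator(LCGSource(seed=12345))
samples = [gen.random() for _ in range(3)]
```

The point of the sketch is only the division of labour: the source owns the
state and produces bits, and the generator holds a reference to a source and
shapes those bits into values.
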
> > > There are also a couple of convenience attributes in the user-facing
> > > API that I would suggest refining:
> > >   - The "brng" attribute of RandomGenerator is not a very descriptive
> > >     name. I would prefer "stream" or "source", or the more explicit
> > >     "base_rng" if we stick with that term.
> > >   - I don't think we need the "generator" property on base RNG
> > >     objects. It is fine to require writing np.random.Generator(base)
> > >     instead. Looking at the implementation, .generator caches the
> > >     RandomGenerator objects it creates on the base RNG, which creates
> > >     a reference cycle. Yes, Python can garbage collect reference
> > >     cycles, but this is still a muddled data model.
> >
> > The attribute name should match the final (descriptive) name, whatever
> > it is.  In RandomGen I am using the `basic_rng` attribute name, but this
> > could be `source`.  I also use a property so that the attribute can have
> > a docstring attached for use in IPython. I think this is more
> > user-friendly.
> >
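
For readers unfamiliar with the property trick, a small sketch of how a
read-only property can carry its own docstring. The Generator class and the
`source` name here are placeholders, not the actual RandomGen code.

```python
class Generator:
    """Hypothetical generator wrapping a source of random bits."""

    def __init__(self, source):
        self._source = source

    @property
    def source(self):
        """The underlying source of random bits used by this generator."""
        return self._source


gen = Generator(source=object())
doc = Generator.source.__doc__  # the docstring lives on the property object
```

A plain instance attribute has no place to hang such a docstring, and the
property is also read-only because no setter is defined.
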
> > I think dropping the `generator` property on the basic RNGs is
> > reasonable.  It was a convenience but is awkward, and I always understood
> > that it creates a cycle.
> >
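
The cycle in question can be shown with a short sketch. CachingSource and
Generator below are hypothetical stand-ins for the pattern described above,
and the behaviour shown assumes CPython's reference counting.

```python
import gc
import weakref


class Generator:
    def __init__(self, source):
        self.source = source  # generator -> source


class CachingSource:
    """Hypothetical base RNG that caches the generator it hands out."""

    def __init__(self):
        self._generator = None

    @property
    def generator(self):
        if self._generator is None:
            # source -> generator, which closes the cycle
            self._generator = Generator(self)
        return self._generator


gc.disable()  # keep the cycle collector from running on its own
try:
    source = CachingSource()
    source.generator  # populate the cache, creating the cycle
    probe = weakref.ref(source)
    del source
    alive_before_gc = probe() is not None  # the cycle keeps the object alive
    gc.collect()
    alive_after_gc = probe() is not None   # the cycle collector frees it
finally:
    gc.enable()
```

Without the cache, writing Generator(source) explicitly leaves only a
one-way reference from the generator to the source, so reference counting
alone can reclaim both objects.
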
> > > Finally, why do we expose the np.random.gen object? I thought part of
> > > the idea with the new API was to avoid global mutable state.
> >
> > Module level functions are essential for quick experiments and should be
> > provided.  The only difference here is that the singleton `seed` and
> > `state` are no longer exposed, so that it isn't possible (using the
> > exposed API) to set the seed.
> >
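
As an analogy for the global-state concern, the standard library's `random`
module has the same two styles: module-level functions backed by one hidden
shared instance, and explicit `random.Random` objects. This is a sketch of
the trade-off using the stdlib, not the numpy API under discussion.

```python
import random

# Module-level functions share one hidden, mutable generator.
random.seed(0)
a = random.random()
random.seed(0)  # any caller, anywhere in the process, can reset the state
b = random.random()

# Explicit instances keep their state local to the object.
rng1 = random.Random(0)
rng2 = random.Random(0)
c = rng1.random()
d = rng2.random()
```

Here `a == b` because the shared state was reset between the two calls,
which is exactly the kind of action at a distance that explicit instances
avoid. Not exposing `seed` on the singleton removes that particular
hazard, but the singleton's state is still shared by every caller.
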
> > _______________________________________________
> > NumPy-Discussion mailing list
> > NumPy-Discussion at python.org
> > https://mail.python.org/mailman/listinfo/numpy-discussion
>
>
>
> --
> Those who don't understand recursion are doomed to repeat it