[Numpy-discussion] overhauling numpy.random and randomgen

Neal Becker ndbecker2 at gmail.com
Fri Apr 19 08:16:01 EDT 2019


The Boost.Random C++ library uses the terms 'generators' and
'distributions'.  Distributions are applied to generators.
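
To make that split concrete, here is a rough Python sketch of the same
idea.  The class names UniformReal and Bernoulli are made up for
illustration (they are not Boost or NumPy names), and the standard
library's random.Random stands in for the generator:

```python
import random


class UniformReal:
    """Hypothetical distribution: maps generator output to a float in [low, high)."""

    def __init__(self, low, high):
        self.low = low
        self.high = high

    def __call__(self, gen):
        # Draw 53 raw bits from the generator and scale them into [0, 1).
        u = gen.getrandbits(53) / (1 << 53)
        return self.low + (self.high - self.low) * u


class Bernoulli:
    """Hypothetical distribution: True with probability p."""

    def __init__(self, p):
        self.p = p

    def __call__(self, gen):
        return gen.random() < self.p


# One generator (the source of randomness) feeds any number of distributions.
gen = random.Random(1234)
uniform = UniformReal(2.0, 5.0)
coin = Bernoulli(0.25)
sample = uniform(gen)
flip = coin(gen)
```

The distribution objects hold only their parameters; all of the state
lives in the generator that is passed in.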

On Fri, Apr 19, 2019 at 7:54 AM Kevin Sheppard
<kevin.k.sheppard at gmail.com> wrote:
>
> > Rather than "base RNG", what about calling these classes a "random source"
> > or "random stream"? In particular, I would suggest defining two Python
> > classes:
> > - np.random.Generator as a less redundant name for what is currently called
> >   RandomGenerator
> > - np.random.Source or np.random.Stream as an abstract base class for what
> >   are currently called "base RNGs"
>
> Naming is definitely hard.  Simple RNGs are currently called basic RNGs, a name inspired by mkl-random. 'source' sounds OK to me, but sort of hides the fact that these are the actual pseudo-RNGs. `stream` has a technical meaning (a single PRNG may produce multiple independent streams) and IMO should be avoided since this might lead to confusion.  Perhaps source_rng (or in docs Source RNG)?
>
> RandomGenerator is actually RandomTransformer, but I didn't like the latter name.
>
> > There are also a couple of convenience attributes in the user-facing API
> > that I would suggest refining:
> >   - The "brng" attribute of RandomGenerator is not a very descriptive name. I
> >     would prefer "stream" or "source", or the more explicit "base_rng" if we
> >     stick with that term.
> >   - I don't think we need the "generator" property on base RNG objects. It is
> >     fine to require writing np.random.Generator(base) instead. Looking at the
> >     implementation, .generator caches the RandomGenerator objects it creates on
> >     the base RNG, which creates a reference cycle. Yes, Python can garbage
> >     collect reference cycles, but this is still a muddled data model.
>
> The attribute name should match the final (descriptive) name, whatever it is.  In RandomGen I am using the `basic_rng` attribute name, but this could be `source`.  I also use a property so that the attribute can have a docstring attached for use in IPython. I think this is more user-friendly.
>
> I think dropping the `generator` property on the basic RNGs is reasonable.  It was a convenience but is awkward, and I always understood that it creates a cycle.
>
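
A small sketch of the cycle being described, for anyone following along.
BaseRNG and Generator here are simplified stand-ins written for this
example, not the actual RandomGen classes, and the caching is only my
reading of the description above:

```python
import gc
import weakref


class Generator:
    """Stand-in for RandomGenerator: keeps a reference to its base RNG."""

    def __init__(self, base):
        self.base = base


class BaseRNG:
    """Stand-in for a basic RNG with a cached `generator` property."""

    def __init__(self):
        self._generator = None

    @property
    def generator(self):
        # Caching the Generator on the base RNG closes the loop:
        # base -> generator -> base.
        if self._generator is None:
            self._generator = Generator(self)
        return self._generator


def demo():
    """Return (alive after del, alive after gc.collect()) for a cyclic pair."""
    gc.disable()  # keep the automatic collector from running mid-demo
    try:
        base = BaseRNG()
        gen = base.generator
        probe = weakref.ref(base)
        del base, gen
        alive_before = probe() is not None  # refcounting alone cannot free it
        gc.collect()
        alive_after = probe() is not None   # the cycle collector can
    finally:
        gc.enable()
    return alive_before, alive_after
```

Writing Generator(base) explicitly, without the cached property, leaves
only the one-way reference from the generator to the base RNG.
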
> > Finally, why do we expose the np.random.gen object? I thought part of the
> > idea with the new API was to avoid global mutable state.
>
> Module-level functions are essential for quick experiments and should be provided.  The only difference here is that the singleton `seed` and `state` are no longer exposed, so it isn't possible (using the exposed API) to set the seed.



-- 
Those who don't understand recursion are doomed to repeat it

