[Numpy-discussion] overhauling numpy.random and randomgen

Kevin Sheppard kevin.k.sheppard at gmail.com
Fri Apr 19 07:54:09 EDT 2019


> Rather than "base RNG", what about calling these classes a "random source"
> or "random stream"? In particular, I would suggest defining two Python
> classes:
> - np.random.Generator as a less redundant name for what is currently called
>   RandomGenerator
> - np.random.Source or np.random.Stream as an abstract base class for what
>   are currently called "base RNGs"

Naming is definitely hard.  Simple RNGs are currently called basic RNGs,
which was inspired by mkl-random. 'source' sounds OK to me, but it sort of
hides the fact that these are the actual pseudo-RNGs. `stream` has a
technical meaning (a single PRNG may produce multiple independent streams)
and IMO should be avoided since it might lead to confusion.  Perhaps
source_rng (or in docs Source RNG)?

RandomGenerator is actually RandomTransformer, but I didn't like the latter
name.

> There are also a couple of convenience attributes in the user-facing API
> that I would suggest refining:
>   - The "brng" attribute of RandomGenerator is not a very descriptive name. I
>     would prefer "stream" or "source", or the more explicit "base_rng" if we
>     stick with that term.
>   - I don't think we need the "generator" property on base RNG objects. It is
>     fine to require writing np.random.Generator(base) instead. Looking at the
>     implementation, .generator caches the RandomGenerator objects it creates on
>     the base RNG, which creates a reference cycle. Yes, Python can garbage
>     collect reference cycles, but this is still a muddled data model.

The attribute name should match the final (descriptive) name, whatever it
is.  In RandomGen I am using the `basic_rng` attribute name, but this could
be `source`.  I also use a property so that the attribute can have a
docstring attached for use in IPython. I think this is more user-friendly.
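
A minimal sketch of the property approach, assuming a stripped-down stand-in
class rather than the real RandomGen code (only the attribute name `basic_rng`
comes from the text above; the docstring wording is made up for illustration):

```python
class RandomGenerator:
    """Hypothetical stand-in; only the attribute handling is sketched."""

    def __init__(self, basic_rng):
        self._basic_rng = basic_rng

    @property
    def basic_rng(self):
        """The basic RNG instance that supplies this generator's raw bits."""
        return self._basic_rng


gen = RandomGenerator(object())
# The docstring lives on the property object stored on the class.
doc = RandomGenerator.basic_rng.__doc__
```

A plain instance attribute cannot carry its own docstring, and since no setter
is defined the property is also read-only.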

I think dropping the `generator` property on the basic RNGs is reasonable.
It was a convenience but is awkward, and I always understood that it
creates a cycle.
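
The cycle can be shown with a small sketch. The classes below are hypothetical
stand-ins for the real ones, and the check relies on CPython's reference
counting: with the cyclic collector switched off, the pair built through the
cached property stays alive after the last outside reference is dropped, while
the pair built with an explicit constructor call is freed immediately.

```python
import gc
import weakref


class RandomGenerator:
    """Hypothetical stand-in that holds a reference to its basic RNG."""

    def __init__(self, basic_rng):
        self._basic_rng = basic_rng


class BasicRNG:
    """Hypothetical stand-in for a basic RNG with a cached .generator."""

    def __init__(self):
        self._generator = None

    @property
    def generator(self):
        # Caching the generator here makes the two objects point at each other.
        if self._generator is None:
            self._generator = RandomGenerator(self)
        return self._generator


def survives_refcounting(use_cached_property):
    """Return True if the basic RNG outlives its last outside reference."""
    was_enabled = gc.isenabled()
    gc.disable()  # leave only reference counting active
    try:
        brng = BasicRNG()
        gen = brng.generator if use_cached_property else RandomGenerator(brng)
        ref = weakref.ref(brng)
        del brng, gen
        return ref() is not None
    finally:
        if was_enabled:
            gc.enable()
        gc.collect()  # clean up any cycle left behind


cyclic = survives_refcounting(True)
acyclic = survives_refcounting(False)
```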

> Finally, why do we expose the np.random.gen object? I thought part of the
> idea with the new API was to avoid global mutable state.

Module-level functions are essential for quick experiments and should be
provided.  The only difference here is that the singleton's `seed` and
`state` are no longer exposed, so it isn't possible (using the exposed
API) to set the seed.
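
A rough sketch of that arrangement, using the standard library's
`random.Random` as a stand-in source. The names `_Generator`, `random_sample`
and `integers` are assumptions for illustration, not the actual module layout:

```python
import random as _stdlib_random


class _Generator:
    """Hypothetical generator; it has no seed() method or state property."""

    def __init__(self):
        # Seeded by the standard library from OS entropy when available.
        self._rng = _stdlib_random.Random()

    def random_sample(self):
        return self._rng.random()

    def integers(self, low, high):
        return self._rng.randrange(low, high)


# Module-level singleton and the convenience functions bound to it.
gen = _Generator()
random_sample = gen.random_sample
integers = gen.integers

x = random_sample()
k = integers(0, 10)
```

Anyone who needs reproducible draws would construct their own generator with
an explicit seed instead of going through the module-level functions.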