[Numpy-discussion] overhauling numpy.random and randomgen

Stephan Hoyer shoyer at gmail.com
Thu Apr 18 14:23:12 EDT 2019


Matti, Kevin and Robert -- thanks for putting this together! I am very
excited about these long-awaited improvements to numpy.random.

I have a number of concerns about the user-facing API, starting with the
names "Random Generator" and "Base Random Number Generator," which I
suspect will be a source of confusion. In particular, the current docs seem
to use the term "random number generator" interchangeably for both.

Rather than "base RNG", what about calling these classes a "random source"
or "random stream"? In particular, I would suggest defining two Python
classes:
- np.random.Generator as a less redundant name for what is currently called
RandomGenerator
- np.random.Source or np.random.Stream as an abstract base class for what
are currently called "base RNGs"

Even if we don't yet provide an API for defining sources of randomness
outside of NumPy, a base class for sources of randomness is valuable
because it clearly defines the shared interface.
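
To make the suggestion concrete, here is a minimal sketch of how the two
classes could relate. It is illustrative only: Source, ToySource, the
"source" attribute and the random_raw() method name are assumptions for
this example, not the interface in the current branch, and ToySource is
backed by the standard library rather than by a real NumPy base RNG.

```python
import abc
import random


class Source(abc.ABC):
    """Hypothetical abstract base class for what are currently called base RNGs."""

    @abc.abstractmethod
    def random_raw(self) -> int:
        """Return the next raw unsigned 64-bit integer from the stream."""


class ToySource(Source):
    """Illustrative source backed by the standard library, not a NumPy BRNG."""

    def __init__(self, seed: int) -> None:
        self._rng = random.Random(seed)

    def random_raw(self) -> int:
        return self._rng.getrandbits(64)


class Generator:
    """Hypothetical stand-in for RandomGenerator: distributions built on a Source."""

    def __init__(self, source: Source) -> None:
        if not isinstance(source, Source):
            raise TypeError("source must be a Source instance")
        self.source = source

    def random_sample(self) -> float:
        # Keep the top 53 bits and scale them to a double in [0, 1).
        return (self.source.random_raw() >> 11) * (1.0 / (1 << 53))
```

With this layout, Generator(ToySource(1234)).random_sample() is the only
spelling needed, and the abstract method documents exactly what a source
of randomness has to provide.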

There are also a couple of convenience attributes in the user-facing API
that I would suggest refining:
- The "brng" attribute of RandomGenerator is not a very descriptive name. I
would prefer "stream" or "source", or the more explicit "base_rng" if we
stick with that term.
- I don't think we need the "generator" property on base RNG objects. It is
fine to require writing np.random.Generator(base) instead. Looking at the
implementation, .generator caches the RandomGenerator objects it creates on
the base RNG, which creates a reference cycle. Yes, Python can garbage
collect reference cycles, but this is still a muddled data model.
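
A small sketch of the reference cycle I mean. The class names here are
hypothetical stand-ins that only mimic the caching pattern; they are not
the code in the branch. The check relies on CPython's reference counting,
so other interpreters may behave differently.

```python
import gc
import weakref


class Generator:
    """Hypothetical stand-in that keeps a reference to its base RNG."""

    def __init__(self, base) -> None:
        self.base = base


class CachingBase:
    """Hypothetical base RNG that caches the generator it creates."""

    def __init__(self) -> None:
        self._generator = None

    @property
    def generator(self) -> Generator:
        if self._generator is None:
            # base -> generator -> base: this is the reference cycle.
            self._generator = Generator(self)
        return self._generator


def freed_without_gc(use_cached_property: bool) -> bool:
    """Report whether a base object is freed by reference counting alone."""
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        base = CachingBase()
        if use_cached_property:
            base.generator  # cached on the base, so a cycle is created
        else:
            Generator(base)  # explicit construction, nothing stored on base
        ref = weakref.ref(base)
        del base
        return ref() is None
    finally:
        if was_enabled:
            gc.enable()
```

On CPython I would expect freed_without_gc(False) to report that the
object is released immediately, while freed_without_gc(True) leaves it
alive until the cycle collector runs.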

Finally, why do we expose the np.random.gen object? I thought part of the
idea with the new API was to avoid global mutable state.
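
To illustrate the concern with an analogy, here is a sketch that uses
the standard library's random module rather than NumPy, so it says
nothing about how np.random.gen itself is implemented. library_helper is
a hypothetical third-party function that draws from the shared
module-level generator.

```python
import random


def library_helper() -> float:
    # Hypothetical third-party code that draws from the shared global state.
    return random.random()


def draw_from_global(call_helper: bool) -> float:
    """Seed the module-level generator, optionally call the helper, then draw."""
    random.seed(0)
    if call_helper:
        library_helper()
    return random.random()


def draw_from_local(call_helper: bool) -> float:
    """Same experiment, but drawing from an explicitly constructed generator."""
    rng = random.Random(0)
    if call_helper:
        library_helper()
    return rng.random()
```

The value returned by draw_from_global depends on whether the helper was
called, because both share one hidden stream; draw_from_local is
unaffected. A module-level generator object invites the first pattern.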

On Thu, Apr 18, 2019 at 7:20 AM Matti Picus <matti.picus at gmail.com> wrote:

> Thanks to the work of Kevin Sheppard, Robert Kern and others, the branch
> to merge randomgen https://github.com/bashtage/randomgen into numpy is
> ready for final review.
>
> The branch is here https://github.com/numpy/numpy/pull/13163. It is
> fully backward compatible: numpy.random.mtrand,
> numpy.random.RandomState, and the various stateful distributions from
> RandomState available as numpy.random.* produce the same streams as the
> current versions. The branch is intended to implement NEP 19
> https://www.numpy.org/neps/nep-0019-rng-policy.html
>
>
> The biggest change is that now there are a variety of random number
> generators available
>
> https://6722-908607-gh.circle-artifacts.com/0/home/circleci/repo/doc/build/html/reference/random/brng/index.html,
>
> and a class numpy.random.RandomGenerator that can produce all the
> distributions from RandomState. A RandomGenerator instance is provided
> for convenience as numpy.random.gen
>
>
> Additional enhancements
>
> https://6722-908607-gh.circle-artifacts.com/0/home/circleci/repo/doc/build/html/reference/random/new-or-different.html
> allow convenient use of the new constructs in CFFI, Numba, Ctypes, and
> Cython.
>
>
> There are a few things to address before merging:
>
> - Review the new constructs and other APIs
>
> - Decide which BRNGs to include in the first release
>
> - Check that your packages still work with the new implementations. You
> can do this by creating a new virtualenv and installing numpy via pip
> install git+https://github.com/mattip/numpy.git@randomgen
>
>
> We will try to have a final video call about the branch during the
> upcoming meeting on May 10-11; more details will follow once we schedule
> the call. The goal is to merge it for the upcoming 1.17 release.
>
>
> The expectation is that this first merge will be followed by
> implementation and documentation tweaks and improvements, but we hope to
> get the major pieces in place as much as possible now.
>
>
> Matti
>
>
> Notes:
>
>
> Sorry for the long URLs; they link to the generated documentation from
> CI. They may not be available a few weeks from now.
>
> There is a tracking issue for further numpy.random work related to the PR:
> https://github.com/numpy/numpy/issues/13164