[Numpy-discussion] Backwards-incompatible improvements to numpy.random.RandomState

Ralf Gommers ralf.gommers at gmail.com
Sun May 24 04:59:49 EDT 2015


On Sun, May 24, 2015 at 10:22 AM, Antony Lee <antony.lee at berkeley.edu>
wrote:

> Hi,
>
> As mentioned in
>
> #1450: Patch with Ziggurat method for Normal distribution
> #5158: ENH: More efficient algorithm for unweighted random choice without
> replacement
> #5299: using `random.choice` to sample integers in a large range
> #5851: Bug in np.random.dirichlet for small alpha parameters
>
> some methods on np.random.RandomState are implemented either non-optimally
> (#1450, #5158, #5299) or have outright bugs (#5851), but cannot be easily
> changed due to backwards compatibility concerns.  While some have suggested
> new methods deprecating the old ones (see e.g. #5872), some consensus has
> formed around the following ideas (see #5299 for original discussion,
> followed by private discussions with @njsmith):
>
> - Backwards compatibility should only be provided to those who were
> explicitly instantiating a seeded RandomState object or reseeding a
> RandomState object to a given value, and drawing variates from it: using
> the global methods (or a None-seeded RandomState) was already
> non-reproducible anyway, since other libraries could be drawing variates
> from the global RandomState (of which the free functions in np.random are
> actually methods).  Thus, the global RandomState object should use the
> latest implementation of the methods.
>

The rest of the proposal looks good to me, but the reasoning on this point
is shaky. np.random.seed() is *very* widely used, and works fine for a test
suite where each test that needs random numbers calls seed(...) and is run
with nose. Can you explain why you need to touch the behavior of the global
methods in order to make RandomState(version=) work?
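
For illustration, a minimal sketch of that test pattern, assuming NumPy is
importable. The function and test names are hypothetical:

```python
import numpy as np


def noisy_mean(n):
    # Draws from the global RandomState behind the np.random free functions.
    return np.random.normal(size=n).mean()


def test_noisy_mean_is_reproducible():
    # Each test seeds the global state itself before drawing.
    np.random.seed(1234)
    first = noisy_mean(100)
    np.random.seed(1234)
    second = noisy_mean(100)
    assert first == second


def test_global_matches_explicit_randomstate():
    # Seeding the global state gives the same stream as a RandomState
    # instantiated with the same seed.
    np.random.seed(1234)
    from_global = np.random.normal(size=5)
    from_local = np.random.RandomState(1234).normal(size=5)
    assert np.array_equal(from_global, from_local)
```

Tests written this way depend on np.random.seed(...) selecting a fixed
stream, which is what a change to the global methods could put at risk.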

Ralf


> - "RandomState(seed)" and "r = RandomState(...); r.seed(seed)" should offer
> backwards-compatibility guarantees (see e.g.
> https://docs.python.org/3.4/library/random.html#notes-on-reproducibility).
>
> As such, we propose the following improvements to the API:
>
> - RandomState gains a (keyword-only) parameter, "version", also accessible
> as a read-only attribute.  This indicates the version of the methods on the
> object.  The current version of RandomState is retroactively assigned
> version 0.  The latest version is available as
> np.random.LATEST_VERSION.  Backwards-incompatible improvements to
> RandomState methods can be introduced, but each one must increase
> LATEST_VERSION.
>
> - The global RandomState is instantiated as
> RandomState(version=LATEST_VERSION).
>
> - RandomState() and rs.seed() set the version to LATEST_VERSION.
>
> - RandomState(seed[!=None]) and rs.seed(seed[!=None]) set the version to
> 0.
>
> A proof-of-concept implementation, still missing tests, is tracked as
> #5911.  It includes the patch proposed in #5158 as an example of how to
> include an improved version of random.choice.
>
> Comments, and help for writing tests (in particular to make sure backwards
> compatibility is maintained) are welcome.
>
> Antony Lee
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
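
The version-selection rules in the quoted proposal can be sketched roughly
as follows. This is a toy model, not the code in #5911: the class name
VersionedRandomState, the method sample_distinct, and the value
LATEST_VERSION = 1 are hypothetical, and the standard library's
random.Random stands in for the real generator.

```python
import random

LATEST_VERSION = 1  # hypothetical value; version 0 is the legacy behavior


class VersionedRandomState:
    """Toy stand-in for the proposed RandomState (not NumPy's code)."""

    def __init__(self, seed=None, *, version=None):
        self._rng = random.Random()
        self._reset(seed, version)

    def _reset(self, seed, version):
        if version is None:
            # Unseeded draws were never reproducible, so they get the newest
            # methods; an explicit seed pins the legacy version 0.
            version = LATEST_VERSION if seed is None else 0
        if not 0 <= version <= LATEST_VERSION:
            raise ValueError("unknown version: %r" % (version,))
        self._version = version
        self._rng.seed(seed)

    @property
    def version(self):
        # Read-only: there is deliberately no setter.
        return self._version

    def seed(self, seed=None):
        self._reset(seed, None)

    def sample_distinct(self, n, k):
        """Return k distinct integers from range(n)."""
        if k > n:
            raise ValueError("cannot draw more than n distinct values")
        if self._version == 0:
            # Legacy-style path: shuffle the whole range, keep a prefix.
            population = list(range(n))
            self._rng.shuffle(population)
            return population[:k]
        # Newer path: sample directly without materializing the range.
        return self._rng.sample(range(n), k)
```

With this sketch, VersionedRandomState() reports version LATEST_VERSION,
VersionedRandomState(42) reports version 0, and
VersionedRandomState(42, version=1) opts a seeded object into the newer
methods explicitly.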