[Numpy-discussion] reseed random generator (1.19)

Robert Kern robert.kern at gmail.com
Mon Jun 29 11:30:00 EDT 2020


On Mon, Jun 29, 2020 at 11:10 AM Kevin Sheppard <kevin.k.sheppard at gmail.com>
wrote:

> It can be anything, but “good practice” is to use a number that would have
> 2 properties:
>
>
>
>    1. When expressed as binary number, it would have a large number of
>    both 0s and 1s
>
>
The properties of the SeedSequence algorithm render this irrelevant,
fortunately. While there are seed numbers that might create "bad" outputs
from SeedSequence with overly low or high Hamming weight (number of 1s),
they are scattered around the input space so you have to adversarially
reverse the SeedSequence algorithm to find them. IMO, the only reason to
avoid seed numbers like this is that there are relatively few of them. If
you are deliberately picking from that small set somehow, it's more likely
that other researchers are too, and you are more likely to reuse the same
seed.
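
To make the Hamming-weight point concrete, here is a minimal sketch that
feeds a few very low-weight seeds through SeedSequence and counts the 1 bits
in 128 bits of output. The helper name `hamming_weight` and the particular
seeds are my own choices for illustration, not anything from the library.

```python
import numpy as np

def hamming_weight(words):
    """Count the 1 bits across an array of unsigned integer words."""
    return sum(bin(int(w)).count("1") for w in words)

# Seeds whose own binary representation has very few 1 bits.
low_weight_seeds = [0, 1, 2, 1 << 64]

weights = {}
for seed in low_weight_seeds:
    # generate_state(4) returns four uint32 words, i.e. 128 bits of output.
    state = np.random.SeedSequence(seed).generate_state(4)
    weights[seed] = hamming_weight(state)
    print(f"seed={seed}: {weights[seed]} of 128 output bits set")
```

If the mixing works as described, I would expect each count to land
somewhere around half of 128 rather than near 0, even though the inputs are
almost all zeros.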


>
>    2. The total number of digits in the binary representation is
>    somewhere between 32 and 128.
>
>
I like using the standard library `secrets` module.

>>> import secrets
>>> secrets.randbelow(1<<128)
8080125189471896523368405732926911908

If you want an easy-to-follow rule, just use the above snippet to get a
128-bit number. More than 128 bits won't do you any good (at least by
default, the internal bottleneck inside of SeedSequence is a 128-bit pool),
and 128-bit numbers are just about small enough to copy-paste comfortably.
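
As a rough sketch of how that fits into a script: generate the seed once,
record it, and pass the integer to `np.random.default_rng`. The variable
names here are illustrative only.

```python
import secrets

import numpy as np

# Draw a fresh 128-bit seed once and record it (e.g. in a log or notebook).
seed = secrets.randbelow(1 << 128)
print(f"seed = {seed}")

# Pass the recorded integer to default_rng to get a seeded Generator.
rng = np.random.default_rng(seed)
sample = rng.standard_normal(3)

# Re-creating the Generator from the same seed reproduces the same stream.
rng_again = np.random.default_rng(seed)
sample_again = rng_again.standard_normal(3)
```

In practice you would paste the printed number back into the script as a
literal, so that later runs reuse it instead of drawing a new one.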

We have thought about wrapping that up in a numpy.random function (e.g.
`np.random.simple_seed()` or something like that) for convenience, but we
wanted to wait a bit before committing to an API.
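
Until something like that exists, a user-side stand-in is only a couple of
lines. This is a hypothetical helper using the standard library alone; the
name `simple_seed` and the `bits` parameter are assumptions, not a NumPy API.

```python
import secrets

def simple_seed(bits=128):
    """Hypothetical helper (not part of NumPy): return a random seed below 2**bits."""
    # secrets.randbits(k) returns a non-negative int with k random bits.
    return secrets.randbits(bits)

seed = simple_seed()
print(seed)
```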

-- 
Robert Kern