[Numpy-discussion] reseed random generator (1.19)

Evgeni Burovski evgeny.burovskiy at gmail.com
Sat Jul 4 13:01:29 EDT 2020


Thanks Kevin, thanks Robert, this is very helpful!

I'd strongly agree with Matti that your explanations could/should make
it to the docs. Maybe it's something for the GSoD.

While we're on the subject, one comment and two (hopefully last) questions:

1. My two cents w.r.t. `np.random.simple_seed()` function Robert
mentioned: I personally would find it way more confusing than a clear
explanation + example in the docs. I'd ask myself what's "simple"
here, click through to the source of this `simple_seed`, find out that
it's a docstring and a two-liner, and just copy-paste the latter into
my user code. Again, just FWIW.

2. What would be a preferred way of spelling out "give me the N-th
spawned child SeedSequence"?
The use case is that I prepare (human-readable) input files once and
run a number of computational jobs in separate OS processes. From what
Kevin said, I can of course give each worker a pair of (entropy,
worker_id) and then each of them does at startup

> parent_seq = SeedSequence(entropy)
> this_sequence = parent_seq.spawn(worker_id + 1)[worker_id]

Is this a recommended way, or is there a better API? Or does the
number of spawned children need to be known beforehand?
I'd much rather avoid serialization/deserialization if possible.
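
One possible spelling, offered as a sketch rather than a recommendation: the SeedSequence constructor accepts a `spawn_key` argument, and each spawned child carries its parent's spawn key extended by the child's index. Assuming zero-based worker ids, a worker could therefore build its own child directly from (entropy, worker_id) without spawning the earlier siblings. The entropy value and worker id below are made-up examples.

```python
import numpy as np
from numpy.random import SeedSequence

# Example values only; in practice both would come from the input file.
entropy = 8080125189471896523368405732926911908
worker_id = 3  # assumed to be zero-based

# Spawn-then-index: needs worker_id + 1 children so that index worker_id exists.
via_spawn = SeedSequence(entropy).spawn(worker_id + 1)[worker_id]

# Direct construction: name the child through its spawn key.
direct = SeedSequence(entropy, spawn_key=(worker_id,))

# Both routes should describe the same child and yield the same seed words.
same = np.array_equal(via_spawn.generate_state(4), direct.generate_state(4))

# Each worker can then build its own generator from its child sequence.
rng = np.random.default_rng(direct)
```

With this spelling the total number of workers does not have to be known in advance, and nothing beyond the two integers has to be passed around.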

3. Is there a way of telling the number of draws a generator did?

The use case is to checkpoint the number of draws and `.advance` the
bit generator when resuming from the checkpoint. (The runs are longer
than the batch queue limits).
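
A sketch of an alternative that avoids counting draws altogether: the bit generator exposes its full state as a plain dict through the `state` property, which can be read at checkpoint time and assigned back on resume. This swaps in state capture for the count-and-`.advance` approach described above. As far as I understand, some Generator methods consume a variable number of raw bit-generator outputs per returned value, so a draw count would not map cleanly onto an `.advance` delta anyway. The seed value and draw sizes are arbitrary examples.

```python
import numpy as np

# Arbitrary example seed; any SeedSequence or integer would do.
rng = np.random.Generator(np.random.PCG64(12345))

# Some work done before the checkpoint.
rng.standard_normal(1000)

# Checkpoint: capture the bit generator state (a dict of plain values).
checkpoint = rng.bit_generator.state

# What the original run would draw next.
expected = rng.standard_normal(5)

# Resume: create a fresh generator of the same type and restore the state.
resumed = np.random.Generator(np.random.PCG64())
resumed.bit_generator.state = checkpoint
actual = resumed.standard_normal(5)

# The resumed stream continues exactly where the checkpoint was taken.
matches = np.array_equal(expected, actual)
```

The captured dict would still need to be written to disk alongside the rest of the checkpoint data, in whatever format the job already uses.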

Thanks!

Evgeni

On Mon, Jun 29, 2020 at 11:06 PM Robert Kern <robert.kern at gmail.com> wrote:
>
> On Mon, Jun 29, 2020 at 11:30 AM Robert Kern <robert.kern at gmail.com> wrote:
>>
>> On Mon, Jun 29, 2020 at 11:10 AM Kevin Sheppard <kevin.k.sheppard at gmail.com> wrote:
>>>
>>> The total number of digits in the binary representation is somewhere between 32 and 128.
>>
>>
>> I like using the standard library `secrets` module.
>>
>> >>> import secrets
>> >>> secrets.randbelow(1<<128)
>> 8080125189471896523368405732926911908
>>
>> If you want an easy-to-follow rule, just use the above snippet to get a 128-bit number. More than 128 bits won't do you any good (at least by default, the internal bottleneck inside of SeedSequence is a 128-bit pool), and 128-bit numbers are just about small enough to copy-paste comfortably.
>
>
> Sorry, `secrets.randbits(128)` is the cleaner form of this.
>
> --
> Robert Kern
