[Python-ideas] Python's Source of Randomness and the random.py module Redux
Nathaniel Smith
njs at pobox.com
Fri Sep 11 12:26:07 CEST 2015
On Fri, Sep 11, 2015 at 1:11 AM, Paul Moore <p.f.moore at gmail.com> wrote:
> On 10 September 2015 at 23:46, Andrew Barnert <abarnert at yahoo.com> wrote:
>> On Sep 10, 2015, at 07:21, Donald Stufft <donald at stufft.io> wrote:
>>>
>>> Either we can change the default to a secure
>>> CSPRNG and break these functions (and the people using them) which is however
>>> easily fixed by changing ``import random`` to
>>> ``import random; random = random.DeterministicRandom()``
>>
>> But that isn't a fix, unless all your code is in a single module. If I call random.seed in game.py and then call random.choice in aiplayer.py, I'll get different results after your fix than I did before.
>
> Note that this is another case of wanting "correct by default".
> Requiring the user to pass around a RNG object makes it easy to do the
> wrong thing - because (as above) people can too easily create multiple
> independent RNGs by mistake, which means your numbers don't
> necessarily satisfy the randomness criteria any more.
Accidentally creating multiple independent RNGs is not going to cause
any problems with respect to randomness. It only creates a problem
with respect to determinism/reproducibility.
Beyond that I just find your message a bit baffling. I guess I believe
you that you find that passing around RNG objects makes it easy to do
the wrong thing, but it's exactly the opposite of my experience: when
writing code that cares about determinism/reproducibility, then for
me, passing around RNG objects makes it way *easier* to get things
right. It makes it much more obvious what kinds of refactoring will
break reproducibility, and it enables all kinds of useful tricks.
E.g., keeping to the example of games and "aiplayer.py", a common
thing game designers want to do is to record playthroughs so they can
be replayed again as demos or whatever. And a common way to do that is
to (1) record the player's inputs, (2) make sure that the way the game
state evolves through time is deterministic given the player's inputs.
(This isn't necessarily the *best* strategy, but it is a common one.)
Now suppose we're writing a game like this, and we have a bunch of
"enemies", each of whose behavior is partially random. So on each
"tick" we have to iterate through each enemy and update its state.
If we are using a single global RNG, then for correctness it becomes
crucial that we always iterate over all enemies in exactly the same
order. Which is a mess.
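To make the ordering problem concrete, here is a rough sketch. The
function name simulate_shared and the enemy names are made up for
illustration; only random.Random is real stdlib API. Every enemy draws
from one shared RNG, so which numbers an enemy sees depends entirely
on where it sits in the iteration order:

```python
import random


def simulate_shared(seed, names, ticks, order):
    """Hypothetical helper: each enemy draws one number per tick from a
    single shared RNG, visiting enemies in the given order."""
    rng = random.Random(seed)  # one RNG shared by every enemy
    draws = {name: [] for name in names}
    for _ in range(ticks):
        for name in order:
            draws[name].append(rng.random())
    return draws


names = ["a", "b", "c"]
forward = simulate_shared(0, names, 3, ["a", "b", "c"])
backward = simulate_shared(0, names, 3, ["c", "b", "a"])

# Same seed, but enemy "a" now receives the numbers that "c" used to get.
print(forward["a"] == backward["c"])
print(forward["a"] == backward["a"])
```

Any refactoring that changes the iteration order (a set instead of a
list, a different sort key, removing a dead enemy mid-tick) silently
changes every enemy's behavior in a replay.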
A better strategy is, keep one global RNG for the level, but then when
each new enemy is spawned, assign it its own RNG that will be used to
determine its actions, and seed this RNG using a value sampled from
the global RNG (!). Now the overall pattern of the game will be just
as random, still be deterministic, and -- crucially -- it no longer
matters what order we iterate over the enemies in.
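A minimal sketch of that strategy, assuming a made-up Enemy class and
run_level function (neither comes from any real game; the 64-bit seed
width is also just my choice). Each enemy gets its own random.Random
seeded from the level RNG at spawn time:

```python
import random


class Enemy:
    """Hypothetical enemy whose behavior is driven by its own RNG."""

    def __init__(self, name, seed):
        self.name = name
        self.rng = random.Random(seed)  # private, per-enemy RNG
        self.position = 0

    def update(self):
        # Take a random step; only this enemy's RNG is consumed.
        self.position += self.rng.choice([-1, 0, 1])


def run_level(level_seed, names, ticks, reverse_order=False):
    level_rng = random.Random(level_seed)
    # Spawn order still matters here: each seed is sampled from the
    # level RNG at spawn time.
    enemies = [Enemy(name, level_rng.getrandbits(64)) for name in names]
    for _ in range(ticks):
        order = reversed(enemies) if reverse_order else enemies
        for enemy in order:
            enemy.update()
    return {enemy.name: enemy.position for enemy in enemies}


# Update order within a tick no longer affects the outcome.
same = run_level(42, ["a", "b", "c"], 100) == run_level(
    42, ["a", "b", "c"], 100, reverse_order=True
)
print(same)
```

Note that the spawn order still has to be deterministic, since that is
where the level RNG is consumed; it is only the per-tick update order
that becomes irrelevant.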
I particularly would not want to use the global RNG in any program
that was complicated enough to involve multiple modules. Passing state
between inter-module calls using a global variable is pretty much
always a bad plan, and that's exactly what you're talking about here.
Non-deterministic global RNGs are fine, b/c they're semantically
stateless; it's exactly the cases where you care about the determinism
of the RNG state that you want to *stop* using the global RNG.
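As a sketch of what I mean by passing the RNG explicitly (choose_move
and play are hypothetical names standing in for code that would live
in aiplayer.py and game.py respectively):

```python
import random


def choose_move(moves, rng):
    """Would live in aiplayer.py: uses whatever RNG the caller hands in."""
    return rng.choice(moves)


def play(seed, turns):
    """Would live in game.py: owns the RNG and passes it down."""
    rng = random.Random(seed)
    moves = ["left", "right", "wait"]
    return [choose_move(moves, rng) for _ in range(turns)]


first = play(7, 20)

# Unrelated code poking at the global RNG does not disturb the replay.
random.seed(1)
random.random()

second = play(7, 20)
print(first == second)
```

The dependency on RNG state is visible in the function signature, so
it's obvious which calls participate in the deterministic stream and
which don't.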
-n
--
Nathaniel J. Smith -- http://vorpus.org