[Python-ideas] Should our default random number generator be secure?

Tue Sep 15 13:41:30 CEST 2015

On Tue, Sep 15, 2015 at 1:45 AM, M.-A. Lemburg <mal at egenix.com> wrote:
> On 15.09.2015 09:36, Nathaniel Smith wrote:
>>
>> [Using empirical tests to check RNGs]
>>
>> Obviously the thing the scientists worry about is a *strict* subset of
>> what the cryptographers are worried about.
>
> I think this explains why we cannot make ends meet:
>
> A scientist wants to be able to *repeat* a simulation in exactly the
> same way without having to store GBs of data (or send them to colleagues
> to have them verify the results).
>
> Crypto RNGs cannot provide this feature by design.
>
> What people designing PRNGs are after is to improve the statistical
> properties of these PRNGs while still maintaining the repeatability
> of the output.
>
>> This is why it is silly to
>> worry that a crypto RNG will cause problems for a scientific
>> simulation. The cryptographers take the scientists' real goal -- the
>> correctness of arbitrary programs like e.g. a monte carlo simulation
>> -- *much* more seriously than the scientists themselves do. (This is
>> because scientists need RNGs to do their real work, whereas for
>> cryptographers RNGs are their real work.)
>
> Yes, cryptographers are the better folks, understood. These arguments
> are not really helpful. They are not even arguments.

Err... I think we're arguing past each other. (Hint: I'm a scientist,
not a cryptographer ;-).)

My email was *only* trying to clear up the argument that keeps popping
up about whether or not a cryptographic RNG could introduce bias in
simulations etc., as compared to the allegedly-better-behaved Mersenne
Twister. (As in e.g. your comment upthread that "[MT] is proven to be
equidistributed which is a key property needed for it to be used as
basis for other derived probability distributions".) This argument is
incorrect -- equidistribution does not guarantee that an RNG will
produce good results when deriving other probability distributions,
and in general cryptographic RNGs will produce results at least as
good as MT's in terms of correctness of output. On this particular
axis, using a cryptographic RNG is not at all dangerous.
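
As a rough illustration only (one run of a toy simulation, not a
statistical test suite), the sketch below feeds the same Monte Carlo
estimate of pi from the stdlib Mersenne Twister and from
random.SystemRandom, which draws from the OS cryptographic source. The
function name estimate_pi, the sample count and the seed are
assumptions made for the example.

```python
import math
import random


def estimate_pi(rng, n=50_000):
    """Estimate pi by sampling n points in the unit square with rng.random()."""
    inside = 0
    for _ in range(n):
        x, y = rng.random(), rng.random()
        # Count points that fall inside the quarter circle of radius 1.
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / n


# Seeded Mersenne Twister: repeatable from run to run.
mt_estimate = estimate_pi(random.Random(12345))
# OS-backed cryptographic source: not repeatable, but statistically sound.
os_estimate = estimate_pi(random.SystemRandom())

print("MT error:", abs(mt_estimate - math.pi))
print("OS error:", abs(os_estimate - math.pi))
```

Both errors should come out at the size expected from sampling noise
for this n; neither generator is expected to bias the estimate.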

Obviously this is only one of the considerations in choosing an RNG;
the quality of the randomness is totally orthogonal to considerations
like determinism.

(Cryptographers also have deterministic RNGs -- they call them "stream
ciphers" -- and these will also meet or beat MT in any practically
relevant test of correctness for the same reasons I outlined, despite
not being provably equidistributed. Of course there are then yet other
trade-offs like speed. But that's not really relevant to this thread,
because no-one is proposing replacing MT as the standard deterministic
RNG in Python; I'm just trying to be clear about how one judges the
quality of randomness that an RNG produces.)
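
To make the "deterministic but cryptographic" idea concrete, here is a
minimal sketch of a seedable generator that plugs into the stdlib
random.Random interface. The stdlib ships no stream cipher, so SHA-256
over (seed, counter) stands in for one here; that substitution, the
class name HashCounterRandom and the seed encoding are all assumptions
of the example, and nothing about it has been vetted for security use.

```python
import hashlib
import random


class HashCounterRandom(random.Random):
    """Hypothetical deterministic generator: SHA-256 over (seed, counter)."""

    def seed(self, a=0, version=2):
        # Illustrative seed encoding; getstate()/setstate() are not adapted.
        self._key = str(a).encode()
        self._counter = 0

    def _block(self):
        # Produce the next 32 bytes of output and advance the counter.
        data = self._key + self._counter.to_bytes(8, "big")
        self._counter += 1
        return hashlib.sha256(data).digest()

    def getrandbits(self, k):
        nbytes = (k + 7) // 8
        buf = b""
        while len(buf) < nbytes:
            buf += self._block()
        # Keep the top k bits of the first nbytes bytes.
        return int.from_bytes(buf[:nbytes], "big") >> (nbytes * 8 - k)

    def random(self):
        # 53 random bits mapped to a float in [0.0, 1.0).
        return self.getrandbits(53) / (1 << 53)


a = HashCounterRandom(42)
b = HashCounterRandom(42)
run_a = [a.gauss(0, 1) for _ in range(5)]
run_b = [b.gauss(0, 1) for _ in range(5)]
print(run_a == run_b)  # same seed, same stream
```

Because the derived methods of random.Random (gauss, randrange and so
on) are built on random() and getrandbits(), a simulation written
against this interface can be replayed from the seed alone.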

-n

-- 
Nathaniel J. Smith -- http://vorpus.org