[Python-ideas] Should our default random number generator be secure?
Robert Kern
robert.kern at gmail.com
Tue Sep 15 13:21:57 CEST 2015
On 2015-09-15 04:53, Steven D'Aprano wrote:
> On Mon, Sep 14, 2015 at 10:19:09PM +0100, Robert Kern wrote:
>
>> The requirement for a good PRNG for simulation work is that it be *well*
>> distributed in reasonable dimensions, not that it be *exactly*
>> equidistributed for some k. And well-distributedness is exactly what is
>> tested in TestU01. It is essentially a collection of simulations designed
>> to expose known statistical flaws in PRNGs. So to your earlier question as
>> to which is more damning, failing TestU01 or not being perfectly 623-dim
>> equidistributed, failing TestU01 is.
>
> I'm confused here. Isn't "well-distributed" a less-strict test than
> "exactly equidistributed"? MT is (almost) exactly k-equidistributed up
> to k = 623, correct? So how does it fail the "well-distributed" test?
k=623 is a tiny number of dimensions for testing "well-distributedness". You
should be able to draw millions of values without detecting significant
correlations.
Perfect k-dim equidistribution is not a particularly useful metric on its own
(at least for simulation work). You can't just say "PRNG A has a bigger k than
PRNG B therefore PRNG A is better". You need a minimum period to even possibly
reach a certain k, and that period goes up exponentially with k. Given two PRNGs
that have the same period, but one has a much smaller k than the other, *then*
you can start making inferences about relative quality (again for simulation
work; ChaCha20 has a long period but no guarantees of k that I am aware of, but
its claim to fame is security, not simulation work). Astronomical periods have
costs, so you only want to pay for what is actually worth it, so it's certainly
a good thing that the MT has a k near its upper bound. PRNGs with shorter, but
still roomy periods like 2**128 are not worse because they have necessarily
smaller ks.
--
Robert Kern
"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco
More information about the Python-ideas
mailing list