Testing random
Jonas Wielicki
jonas at wielicki.name
Sun Jun 7 06:41:53 EDT 2015
On 07.06.2015 08:27, Cecil Westerhof wrote:
> I wrote a very simple function to test random:
> def test_random(length, multiplier = 10000):
>     number_list = length * [0]
>     for i in range(length * multiplier):
>         number_list[random.randint(0, length - 1)] += 1
>     minimum = min(number_list)
>     maximum = max(number_list)
>     return (minimum, maximum, minimum / maximum)
>
> When running:
> for i in range(1, 7):
>     print(test_random(100, 10 ** i))
> I get:
> (3, 17, 0.17647058823529413)
> (73, 127, 0.5748031496062992)
> (915, 1086, 0.8425414364640884)
> (9723, 10195, 0.9537027954879843)
> (99348, 100620, 0.9873583780560524)
> (997198, 1002496, 0.9947151908835546)
>
> and when running:
> for i in range(1, 7):
>     print(test_random(1000, 10 ** i))
> I get:
> (2, 20, 0.1)
> (69, 138, 0.5)
> (908, 1098, 0.8269581056466302)
> (9684, 10268, 0.9431242695753799)
> (99046, 100979, 0.9808574059953059)
> (996923, 1003598, 0.9933489305478886)
>
> It shows that when the first parameter increases the deviation
> increases and when the second parameter increases the deviation
> decreases. Exactly what you would expect. But what are the ranges you
> would expect with a good random function?
That really depends on the number of samples. Apparently, a good RNG
would come close to 1.0 for the ratio as the sample count grows.
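For a rough idea of what ratio to expect from a good RNG, here is a
back-of-envelope sketch (my own, not a rigorous derivation; the name
``expected_ratio`` is made up): each bucket count is approximately
Binomial(N, 1/length), and the extremes among ``length`` such counts
typically sit around sqrt(2 * ln(length)) standard deviations from the
mean:

```python
import math

def expected_ratio(length, multiplier):
    """Rough estimate of the min/max ratio for a uniform RNG.

    Each bucket count is ~ Binomial(N, 1/length) with N = length *
    multiplier.  The extreme of `length` roughly-normal counts deviates
    from the mean by about sqrt(2 * ln(length)) standard deviations
    (the usual asymptotic for the maximum of n normals).
    """
    n_samples = length * multiplier
    mean = n_samples / length          # = multiplier
    std = math.sqrt(n_samples * (1 / length) * (1 - 1 / length))
    c = math.sqrt(2 * math.log(length))
    return (mean - c * std) / (mean + c * std)

for i in range(1, 7):
    print(round(expected_ratio(100, 10 ** i), 3))
```

For large multipliers this lands close to the values you measured
(around 0.994 for multiplier 10**6 and length 100); for small samples
the normal approximation is crude and should only be read as an order
of magnitude.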
> Then it could be used to test a random function.
This is an interesting test (more interesting to me than it looks at
first sight, and certainly better than what I had come up with), but
unfortunately, there is more to testing randomness.
The test clearly suggests that random functions should have a min/max
ratio of about 1.0. Think of a "random" function like this:
def fake_random(minv, maxv, _cache={}):
    try:
        return next(_cache[minv, maxv])
    except (StopIteration, KeyError):
        iterator = iter(range(minv, maxv + 1))
        _cache[minv, maxv] = iterator
        return next(iterator)
(If that is hard to read, I agree; it returns the sequence from minv to
maxv (inclusive) over and over again for equal minv and maxv. Don't
pass anything to _cache :), just call it like random.randint.)
This gives a "perfect" ratio of 1.0, but is not very random. This kind
of function would probably be ruled out by a compression or correlation
test. If you want to dig deeper into the algorithms for random testing,
the dieharder test suite [1] is probably worth a look.
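To make the correlation-test point concrete, here is a minimal sketch
(the helper name ``lag1_correlation`` is my own; the ``fake_random``
copy is the one from above) computing the lag-1 Pearson correlation of
successive outputs. ``fake_random``'s next output is almost perfectly
predictable from the previous one, while ``random.randint``'s is not:

```python
import random

def fake_random(minv, maxv, _cache={}):
    # Copy of the fake_random above: cycles minv..maxv forever.
    try:
        return next(_cache[minv, maxv])
    except (StopIteration, KeyError):
        iterator = iter(range(minv, maxv + 1))
        _cache[minv, maxv] = iterator
        return next(iterator)

def lag1_correlation(samples):
    """Pearson correlation between each sample and its successor."""
    xs, ys = samples[:-1], samples[1:]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

fake = [fake_random(0, 99) for _ in range(10000)]
real = [random.randint(0, 99) for _ in range(10000)]
print(lag1_correlation(fake))  # strongly positive: very predictable
print(lag1_correlation(real))  # near zero
```

Both sequences would pass the min/max ratio test equally well, which is
exactly why a single statistic is not enough.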
It all boils down to the use case. For some use cases, the
``fake_random`` might be good enough (unittesting would be such a case:
it is certainly uniformly distributed and allows full coverage of the
tested range), for others it would fail catastrophically (don’t generate
your cryptographic keys with this!).
cheers,
Jonas
[1]: http://www.phy.duke.edu/~rgb/General/dieharder.php