[Python-Dev] Re: [Python-checkins] python/dist/src/Lib random.py, 1.62, 1.63
Tim Peters
tim.peters at gmail.com
Mon Aug 30 21:45:17 CEST 2004
[rhettinger at users.sourceforge.net]
> Modified Files:
> random.py
> Log Message:
> Teach the random module about os.urandom().
>
...
> * Provide an alternate generator based on it.
...
> + _tofloat = 2.0 ** (-7*8) # converts 7 byte integers to floats
...
> +class HardwareRandom(Random):
...
> + def random(self):
...
> + return long(_hexlify(_urandom(7)), 16) * _tofloat
Feeding in more bits than actually fit in a float leads to bias due to
rounding: 7 bytes supply 56 bits, but a float holds only 53 bits of
precision. Here:
"""
import random
import math
import sys

def main(n, useHR):
    from math import ldexp
    if useHR:
        get = random.HardwareRandom().random
    else:
        get = random.random
    counts = [0, 0]
    for i in xrange(n):
        # Scale to a 53-bit integer and tally its least significant bit.
        x = long(ldexp(get(), 53)) & 1
        counts[x] += 1
    print counts
    expected = n / 2.0
    chisq = (counts[0] - expected)**2 / expected + \
            (counts[1] - expected)**2 / expected
    print "chi square statistic, 1 df, =", chisq

n, useHR = map(int, sys.argv[1:])
main(n, useHR)
"""
Running with the default Mersenne Twister generator gives comfortable
chi-squared values for the distribution of the 2**-53 bit:
C:\Code\python\PCbuild>python temp.py 100000 0
[50082, 49918]
chi square statistic, 1 df, = 0.26896
C:\Code\python\PCbuild>python temp.py 100000 0
[49913, 50087]
chi square statistic, 1 df, = 0.30276
C:\Code\python\PCbuild>python temp.py 100000 0
[50254, 49746]
chi square statistic, 1 df, = 2.58064
Running with HardwareRandom instead gives astronomically unlikely values:
C:\Code\python\PCbuild>python temp.py 100000 1
[52994, 47006]
chi square statistic, 1 df, = 358.56144
C:\Code\python\PCbuild>python temp.py 100000 1
[53097, 46903]
chi square statistic, 1 df, = 383.65636
C:\Code\python\PCbuild>python temp.py 100000 1
[53118, 46882]
chi square statistic, 1 df, = 388.87696
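To put a number on "comfortable" versus "astronomically unlikely", here is a minimal Python 3 sketch that recomputes the statistic from a pair of counts and converts it to a tail probability. It relies on the standard identity that, for 1 degree of freedom, the chi-square upper tail equals erfc(sqrt(x/2)). The function names are hypothetical and not part of the random module.

```python
import math

def chi_square_two_bins(counts):
    # Same statistic as the script above: two bins, each expected to hold n/2.
    n = sum(counts)
    expected = n / 2.0
    return sum((c - expected) ** 2 / expected for c in counts)

def p_value_1df(chisq):
    # Upper-tail probability of a chi-square variate with 1 degree of freedom.
    # With 1 df the variate is the square of a standard normal, so the tail
    # is erfc(sqrt(chisq / 2)).
    return math.erfc(math.sqrt(chisq / 2.0))

if __name__ == "__main__":
    # Counts copied from the runs shown above.
    for counts in ([50082, 49918], [52994, 47006]):
        stat = chi_square_two_bins(counts)
        print(counts, stat, p_value_1df(stat))
```

Under this reading, a statistic below about 3.84 is unremarkable at the 5% level, while values in the hundreds correspond to tail probabilities far too small to attribute to chance.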
One way to repair that is to replace the computation with
return _ldexp(long(_hexlify(_urandom(7)), 16) >> 3, -BPF)
where _ldexp is math.ldexp (and BPF is already a module constant).
Of course that would also be biased on a box where C double had fewer
than BPF (53) bits of precision (but the Twister implementation would
show the same bias then).
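The effect of the rounding, and of the suggested shift, can be illustrated with a small deterministic Python 3 sketch. It assumes that integer-to-float conversion rounds half to even, as it does on IEEE-754 platforms with current CPython; the 2004 interpreter may have differed in detail. The helper names (biased_float, fixed_float, low_bit, count_zero_bits, urandom_float) are hypothetical, and int.from_bytes stands in for the long(_hexlify(...), 16) idiom.

```python
import os
from math import ldexp

BPF = 53                    # bits of precision in a float, as in the post
TOFLOAT = 2.0 ** (-7 * 8)   # the original scale factor for 7-byte integers

def biased_float(n):
    # Original approach: converting the 56-bit integer to a float rounds it
    # to 53 significant bits (assumed: round-half-to-even) before scaling.
    return float(n) * TOFLOAT

def fixed_float(n):
    # Suggested repair: drop the 3 excess bits first, then scale exactly.
    return ldexp(n >> 3, -BPF)

def low_bit(x):
    # The bit the test script tallies: the 2**-53 bit of x.
    return int(ldexp(x, BPF)) & 1

def count_zero_bits(convert, base=1 << 55):
    # Enumerate all 16 patterns of the low 4 bits on top of a fixed high
    # part whose top (56th) bit is set, and count how often the bit is 0.
    return sum(1 for k in range(16) if low_bit(convert(base + k)) == 0)

def urandom_float():
    # Python 3 spelling of the repaired HardwareRandom.random() body.
    n = int.from_bytes(os.urandom(7), "big")   # 56 random bits
    return fixed_float(n)

if __name__ == "__main__":
    print(count_zero_bits(biased_float), "of 16 patterns give a 0 bit with rounding")
    print(count_zero_bits(fixed_float), "of 16 patterns give a 0 bit with the shift")
```

Under the round-half-to-even assumption, ties resolve toward an even result, so the rounded version should favor a 0 bit whenever the top bit of the 56-bit integer is set. Since that happens about half the time, a rough estimate of the overall zero-bit fraction is 1/2 * 9/16 + 1/2 * 1/2 = 17/32, about 0.531, which is in line with the HardwareRandom counts reported above. With the shift, every 53-bit pattern is equally likely and no rounding occurs.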