
On Sat, Nov 05, 2022 at 01:37:30AM -0500, James Johnson wrote:
> I wrote the attached Python 3 code to improve on existing PRNG functions. I used the time module for one method, which resulted in disproportionate odd values, but agreeable means.
First the good news: your random number generator is at least well distributed. We can look at the mean and stdev, and they are about the same as for the MT PRNG (the Mersenne Twister used by Python's random module):
>>> import random, time, statistics
>>> def rand(n):
...     return int(time.time_ns() % n)
...
>>> data1 = [rand(10) for i in range(1000000)]
>>> data2 = [random.randint(0, 9) for i in range(1000000)]
>>> statistics.mean(data1)
4.483424
>>> statistics.mean(data2)
4.498849
>>> statistics.stdev(data1)
2.8723056046255744
>>> statistics.stdev(data2)
2.8734388686467534
There's no real difference there. But let's look at rising and falling sequences: walk through the two runs of random digits, and take +1 if the value goes up, -1 if it goes down, and 0 if it stays the same. With a *good quality* random sequence, on average one time in ten you should get the same value twice in a row. And the number of positive steps and negative steps should be roughly the same. We can see that with the MT output:
>>> steps_mt = []
>>> for i in range(1, len(data2)):
...     a, b = data2[i-1], data2[i]
...     if a < b: steps_mt.append(1)
...     elif a > b: steps_mt.append(-1)
...     else: steps_mt.append(0)
...
>>> steps_mt.count(0)
99826
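(As an aside, here's where the expected figures come from. This is a sketch I've added, not part of the session above: for two independent uniform digits there are 100 equally likely ordered pairs, and we can just enumerate them.)

    # Theoretical step probabilities for independent uniform digits 0-9:
    # enumerate all 100 equally likely ordered pairs (previous, current).
    pairs = [(a, b) for a in range(10) for b in range(10)]
    same = sum(1 for a, b in pairs if a == b)   # 10 pairs -> probability 0.10
    up   = sum(1 for a, b in pairs if a < b)    # 45 pairs -> probability 0.45
    down = sum(1 for a, b in pairs if a > b)    # 45 pairs -> probability 0.45
    # A million values give 999999 steps, so about 0.1 * 999999, i.e.
    # roughly 100,000 of them, should be zeroes.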
That's quite close to the 100,000 zeroes we would expect. And the +1s and -1s almost cancel out, with only a small imbalance:
>>> sum(steps_mt)
431
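In case it isn't obvious how I got the figures below, here is a sketch (my variable names, not part of the original session) of turning those step counts into the ratio and sample probabilities:

    # Derive the -ve/+ve ratio and sample probabilities from the step list.
    up    = steps_mt.count(1)    # steps where the value rose
    down  = steps_mt.count(-1)   # steps where the value fell
    same  = steps_mt.count(0)    # steps where the value repeated
    total = len(steps_mt)        # 999999 steps for a million values
    ratio  = down / up           # -ve steps over +ve steps; ideally close to 1
    p_same = same / total
    p_up   = up / total
    p_down = down / total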
The ratio of -ve steps to +ve steps should be 1, and in this sample we get 0.99904, which is pretty close to what we expect. These results correspond to sample probabilities:

* Probability of getting the same as the previous number: 0.09983 (theoretical 0.1)
* Probability of getting a larger number than the previous: 0.45030 (theoretical 0.45)
* Probability of getting a smaller number than the previous: 0.44987 (theoretical 0.45)

So pretty close, and it supports the claim that MT is a very good quality RNG. But if we do the same calculation with your random function, we get this:
>>> steps.count(0)
96146
>>> sum(steps)
-82609
Wow! This gives us probabilities:

* Probability of getting the same as the previous number: 0.09615 (theoretical 0.1)
* Probability of getting a larger number than the previous: 0.41062 (theoretical 0.45)
* Probability of getting a smaller number than the previous: 0.49323 (theoretical 0.45)

I ran the numbers again with a bigger sample size (3 million instead of 1 million) and the bias got much worse. The ratio of -ve steps to +ve steps, instead of being close to 1, was 1.21908. That's roughly a 20% bias.

The bottom line here is that your random numbers, generated using the time, have strong correlations between consecutive values, and that makes them a much poorer choice for a PRNG.

--
Steve