
I wrote the attached Python 3 code to improve on existing PRNG functions. I used the time module for one method, which resulted in disproportionate odd values, but agreeable means. I used the hashlib module for the second. It is evident that the code is amateur, but the program might result in better PRN generation. The "app" lends itself to checking, using statistical tools (see comments). I remain a fan, James Johnson

In concept, what James suggests is similar to the Random123 library, written at D. E. Shaw Research by my sadly late colleague John Salmon. See https://github.com/DEShawResearch/random123 and the linked academic publications, e.g. https://dl.acm.org/doi/10.1145/2063384.2063405 In general, all PRNGs are deterministic, and by relying on a known seed, the Nth element in a sequence of random numbers can always be reconstructed. However, if a large number of random numbers are used, certain replication scenarios make the purely sequential nature of generators like the Mersenne Twister or linear congruential generators inconvenient. Counter-based pseudo-random generators like Random123 apply cryptographic transformations to counter variables, and were rigorously shown to pass all standard tests of randomness of distribution. However, those tests were done using AES, Threefish, and Philox, and do not automatically apply to the SHA-256 that James uses. The advantage of these generators is that they allow direct construction of the Nth element in a pseudo-random sequence without the large memory or CPU usage needed to construct the N-1 prior elements. That said, Random123 is *done right* by a number of very brilliant people who performed rigorous testing. Something that "seems kinda similar" may or may not be of similar quality (more likely not). FWIW, AES is built in as a primitive instruction on most modern general-purpose CPUs, so it can be very fast to perform. On Mon, Nov 14, 2022 at 10:56 AM Barry <barry@barrys-emacs.org> wrote:
-- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th.

On Tue, 15 Nov 2022 at 03:16, David Mertz, Ph.D. <david.mertz@gmail.com> wrote:
They don't *automatically* apply, but SHA256 exhibits the avalanche effect that all good cryptographic hashes do, so I would expect that it would have the appropriate distribution. But that would be assuming you're doing something like this:

    import hashlib
    import itertools
    import time

    def seed(value):
        global _RANDOM_SEED, _RANDOM_COUNTER
        _RANDOM_SEED = hashlib.sha256(str(value).encode("utf-8"))
        _RANDOM_COUNTER = itertools.count()

    seed(time.time())  # or some other default source

    def random():
        # Hash the seeded state plus an ever-increasing counter,
        # then scale the 256-bit digest into [0, 1).
        h = _RANDOM_SEED.copy()
        h.update(b"%d" % next(_RANDOM_COUNTER))
        return int.from_bytes(h.digest(), "big") / (1 << 256)

I can't see this happening in the OP's code. And even if it were, I'm not sure how much it would buy you over the current system of random number generation.
That said, Random123 is *done right* by a number of very brilliant people who performed rigorous testing. Something that "seems kinda similar" may or may not be of similar quality (more likely not). FWIW, AES is built in as a primitive instruction on most modern general-purpose CPUs, so it can be very fast to perform.
Hashing in general is pretty fast on GPUs, so that may or may not actually be significant. But unless someone's actually complaining about the performance of an RNG, I would be more concerned with its features than its CPU cost, including:

1) Reproducibility, if desired (it sometimes is and sometimes isn't)
2) Appropriate distribution and sequence unpredictability
3) Avoidance of side-channel discoverability, particularly if reproducible

etc.

One thing that IS interesting, when exploring weird and probably suboptimal RNG algorithms, is finding things that can be done to get "kinda okay" randomness when you don't really have a good source of entropy. Let's say you're running a tiny embedded device that needs to report to an HTTPS server with information. How is it to ensure that its SSL sessions are properly secure? Assuming it doesn't have any sort of actual on-board entropy hardware, what's it to do? Sometimes, an imperfect solution is still better than nothing.

ChrisA

I should add, as does the Python documentation, that if you want genuinely non-reproducible random values, the `secrets` module exists and produces the best results possible on a given OS and computer architecture. On Unix-like systems, /dev/random is the best source of entropy you're going to find, and trying to find your own is certainly going to be less good than that source which `secrets` utilizes. On Mon, Nov 14, 2022 at 11:13 AM David Mertz, Ph.D. <david.mertz@gmail.com> wrote:
-- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th.
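A minimal sketch of the `secrets` and `os.urandom` usage being recommended here, using only standard-library calls (the token sizes are arbitrary examples):

```python
import secrets
import os

# Cryptographically strong values from the OS CSPRNG:
token = secrets.token_hex(16)                          # 32 hex characters
number = secrets.randbelow(60)                         # uniform integer in [0, 60)
choice = secrets.choice(["yes", "no", "ask again later"])

# os.urandom exposes the same underlying entropy as raw bytes:
iv = os.urandom(16)

print(token, number, choice, iv.hex())
```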

On Tue, 15 Nov 2022 at 00:14, David Mertz, Ph.D. <david.mertz@gmail.com> wrote:
There's another possibility that you haven't explored. You are only looking at random number generators that produce a linear sequence of numbers. If you add a 'split' function to your generator, that takes one generator and returns two generators that are independent of each other, you can build trees of random numbers, instead of just linear sequences. Those trees also allow parallelisation. (Implementations should take care to ensure that the resulting generators are not correlated.) You can combine the counter-based approach and the split-based approach, of course. If you have a cryptographic hash function, it's relatively easy to give a toy implementation.
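A toy sketch of the splittable idea described above, using SHA-256 purely as the mixing function; the class and method names are invented for this illustration, and no claim is made about the statistical quality of the result:

```python
import hashlib

class SplittableGen:
    """Toy splittable PRNG: state is a SHA-256 digest; for illustration only."""

    def __init__(self, seed_bytes: bytes):
        self._state = hashlib.sha256(seed_bytes).digest()
        self._counter = 0

    def random(self) -> float:
        # Derive the next value from (state, counter), as in a counter-based PRNG.
        digest = hashlib.sha256(self._state + self._counter.to_bytes(8, "big")).digest()
        self._counter += 1
        return int.from_bytes(digest, "big") / (1 << 256)

    def split(self) -> "SplittableGen":
        # The child gets a state derived with a distinct domain-separation tag,
        # so the parent and child streams diverge rather than repeat each other.
        child_seed = hashlib.sha256(self._state + b"split" + self._counter.to_bytes(8, "big")).digest()
        self._counter += 1
        return SplittableGen(child_seed)

g = SplittableGen(b"demo seed")
child = g.split()          # hand `child` to a worker; keep using `g` locally
print(g.random(), child.random())
```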

For background, Random123 was developed for a supercomputer that does molecular dynamics simulations. In particular, for the Anton supercomputer, complete reproducibility of simulations was/is an important constraint. In concept, in that context, you might want to "jump to timestamp 1 billion, and run forward from there" (based on a saved snapshot of the chemical system at that timestamp). In simulations, a certain element of "reproducible randomness" is relevant to inject. For that specific purpose, the splitting of generators that Matthias mentions isn't especially helpful. But for other purposes, it absolutely is. And yes, most certainly these two techniques could be complementary; they need not be mutually exclusive. On Tue, Dec 6, 2022 at 8:45 PM Matthias Görgens <matthias.goergens@gmail.com> wrote:
-- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th.
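A small sketch of the counter-based property being discussed (direct construction of the Nth draw without stepping through the N-1 before it), with SHA-256 as a stand-in for the Random123 transformations; this is not the Random123 algorithm itself:

```python
import hashlib

def nth_random(key: bytes, n: int) -> float:
    """Return draw number n of the stream keyed by `key`, directly.

    Counter-based: the value depends only on (key, n), so draw one billion
    costs the same as draw zero -- no need to generate the prior elements.
    """
    digest = hashlib.sha256(key + n.to_bytes(16, "big")).digest()
    return int.from_bytes(digest, "big") / (1 << 256)

key = b"simulation-42"
print(nth_random(key, 0))
print(nth_random(key, 10**9))                       # direct access to the billionth draw
assert nth_random(key, 7) == nth_random(key, 7)     # reproducible for a given key
```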

https://docs.python.org/3/library/random.html :
https://docs.python.org/3/library/secrets.html#module-secrets

PEP 506 – Adding A Secrets Module To The Standard Library
https://peps.python.org/pep-0506/#alternatives
https://github.com/python/peps/blob/main/pep-0506.txt

PEP 12: new PEP template:
https://github.com/python/peps/blob/main/pep-0012/pep-NNNN.rst

Pseudorandom number generator > Cryptographic PRNGs
https://en.wikipedia.org/wiki/Pseudorandom_number_generator#Cryptographic_PR...

Random number generator attack > Defenses
https://en.wikipedia.org/wiki/Random_number_generator_attack#Defenses

/? CSPRNG
https://www.google.com/search?q=CSPRNG

From "THE LINUX CSPRNG IS NOW GOOD!" https://words.filippo.io/dispatches/linux-csprng/ :
[ get random() is from OpenBSD and LibreSSL ]
"Problems emerge for a unified /dev/*random" (2022) https://lwn.net/Articles/889452/ From https://www.redhat.com/en/blog/understanding-red-hat-enterprise-linux-random... : """ How does the kernel initialize its CSPRNG? The kernel has an “entropy pool,” a place where unpredictable input observed by the kernel is mixed and stored. That pool serves as a seed to the internal CSPRNG, and until some threshold of estimated entropy is reached initially, it is considered uninitialized. Let’s now see how the kernel initializes its entropy pool. 1. After the kernel takes control on power-on, it starts filling its entropy pool by mixing interrupt timing and other unpredictable input. 2. The kernel gives control to systemd. 3. Next, systemd starts and initializes itself. 4. Systemd, optionally, loads kernel modules which will improve the kernel's entropy gathering process on a virtual machine (e.g., virtio-rng). 5. Systemd loads the rngd.service which will gather additional input entropy obtained via a random generator exposed by hardware (e.g., the x86 RDRAND instruction or similar) and jitter entropy1; this entropy is fed back into the kernel to initialize its entropy pool, typically in a matter of milliseconds. After the last step, the kernel has its entropy pool initialized, and any systemd services started can take advantage of the kernel’s random generator. Note that the virtio-rng kernel module loading in step (3), is an optional step which improves entropy gathering in a virtual machine by using the host's random generator to initialize the guest systems in KVM. The rngd.service loading at the final step (4) is what ensures that the kernel entropy pools are initialized on every scenario, and furthermore it continues mixing additional data in the kernel pool during system runtime. """ https://github.com/nhorman/rng-tools/blob/master/fips.c : ```c /* fips.c -- Performs FIPS 140-1/140-2 RNG tests ``` /? FIPS 140-1/140-2 RNG tests https://www.google.com/search?q=FIPS+140-1%2F140-2+RNG+tests /? CMVP "cprng" https://www.google.com/search?q=CMVP+%22cprng%22 https://csrc.nist.gov/publications/detail/fips/140/3/final https://www.google.com/search?q=rng+tests - https://www.johndcook.com/blog/rng-testing/ :
We test RNGs using the standard test suites: PractRand, TestU01 (BigCrush), DIEHARD(ER), NIST SP 800-22.
Randomness tests: https://en.wikipedia.org/wiki/Randomness_test#Notable_software_implementatio... :
- https://en.wikipedia.org/wiki/Diehard_tests
- https://en.wikipedia.org/wiki/TestU01
- /? NIST 800-22 https://www.google.com/search?q=nist+800-22
  /? nist 800-22 site:github.com https://www.google.com/search?q=nist+800-22+site%3Agithub.com
- https://github.com/google/paranoid_crypto/blob/main/docs/randomness_tests.md

From https://cryptography.io/en/latest/random-numbers/
https://github.com/pyca/cryptography/blob/main/docs/random-numbers.rst :

```rst
Random number generation
========================

When generating random data for use in cryptographic operations, such as an
initialization vector for encryption in
:class:`~cryptography.hazmat.primitives.ciphers.modes.CBC` mode, you do not
want to use the standard :mod:`random` module APIs. This is because they do
not provide a cryptographically secure random number generator, which can
result in major security issues depending on the algorithms in use.

Therefore, it is our recommendation to `always use your operating system's
provided random number generator`_, which is available as :func:`os.urandom`.
For example, if you need 16 bytes of random data for an initialization vector,
you can obtain them with:

.. doctest::

    >>> import os
    >>> iv = os.urandom(16)

This will use ``/dev/urandom`` on UNIX platforms, and ``CryptGenRandom`` on
Windows.

If you need your random number as an integer (for example, for
:meth:`~cryptography.x509.CertificateBuilder.serial_number`), you can use
``int.from_bytes`` to convert the result of ``os.urandom``:

.. code-block:: pycon

    >>> serial = int.from_bytes(os.urandom(20), byteorder="big")

In addition, the `Python standard library`_ includes the ``secrets`` module,
which can be used for generating cryptographically secure random numbers, with
specific helpers for text-based formats.

.. _`always use your operating system's provided random number generator`: https://sockpuppet.org/blog/2014/02/25/safely-generate-random-numbers/
.. _`Python standard library`: https://docs.python.org/3/library/secrets.html
```

On Mon, Nov 14, 2022, 10:57 AM Barry <barry@barrys-emacs.org> wrote:

QRNG "Quantum Random Number Generation" -> Hardware random number generator
Physical phenomena with random properties > Quantum random properties https://en.wikipedia.org/wiki/Hardware_random_number_generator#Quantum_rando...
FWIW, SciPy and SymPy have various non-CSPRNG random distributions:
- https://docs.scipy.org/doc/scipy/reference/stats.html
- https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.uniform.htm...
- https://docs.sympy.org/latest/modules/stats.html
- https://docs.sympy.org/latest/modules/stats.html#sympy.stats.Uniform
- https://docs.sympy.org/latest/modules/stats.html#sympy.stats.DiscreteUniform

"RFC 8937: Randomness Improvements for Security Protocols" (2020)
https://www.rfc-editor.org/rfc/rfc8937.html

***

FWIW, UUIDs and W3C DIDs require entropy e.g. from a CSPRNG or better:
https://www.w3.org/TR/did-core/#terminology :
On Mon, Nov 14, 2022, 11:54 AM Wes Turner <wes.turner@gmail.com> wrote:

Thank you for replying with such specific assistance. I am made acutely aware that I am only a Python enthusiast, and not an academic. Hashes are deterministic, not random, but byte by byte they can be very random. Please accept the attached script as a "hack" that might be novel, or a curiosity. My first script was defective in several ways. In this one, I seed the hash by hashing the utf-8 ENCODED string value of the time - it is not random, but it is rarely the same twice. I used 0-9 because it is decimal, and 0-59 because hours and minutes come in 60's. I found that taking time.time_ns() gave me far more odds than evens. I do not know how the scheme I devised of adding hex digits together would scale to larger ranges - it is done by adding hex digits that vary from 0 to 15 repeatedly, but, since the number of hex digits varies with the range, it could actually work for larger ranges. The random function in Python is not really adequate for a magic eight ball program, so as a consumer I am dissatisfied, without examining it at a quantum level, for repeated reference thousands of times each. Thank you for your kind attention, and I hope the (much improved) hack can be helpful to other enthusiasts. James J (A.A. Faulkner University) On Mon, Nov 14, 2022 at 11:23 AM Wes Turner <wes.turner@gmail.com> wrote:
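For readers trying to picture the scheme described above, here is a purely hypothetical reconstruction; this is not the attached script, and the details (hashing the utf-8 time string, summing hex digits 0-15, reducing modulo the range) are guesses from the prose:

```python
import hashlib
import time

def guess_randbelow(n: int) -> int:
    """Hypothetical reconstruction of the described scheme; not the original code."""
    # Seed a hash from the utf-8-encoded current time (rarely the same twice,
    # but not actually random).
    h = hashlib.sha256(str(time.time()).encode("utf-8"))
    # Sum the hex digits of the digest (each between 0 and 15)...
    total = sum(int(c, 16) for c in h.hexdigest())
    # ...and reduce into the requested range.
    return total % n

print([guess_randbelow(10) for _ in range(20)])
```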

While FWIU it is advisable to keep seeding an RNG after startup - that's what e.g. rngd does - the cryptography.io docs do advise to just use `os.urandom()` (which is not the same as random.SystemRandom()?). There probably should be a better default random in CPython; though I'm personally not at all qualified to assess this, TIL about NIST 800-22 and FIPS 140-2 for evaluating sources of entropy.

Differential entropy > Differential entropies for various distributions
https://en.wikipedia.org/wiki/Differential_entropy

*** (TIL that quantum information is never destroyed; so which physical processes are actually nonreversible, as hashing and RNGs are supposed to be, is up for consideration, since classical information is a subset of quantum information. Whether black holes shift gamma radiation may actually be relevant! Perhaps a permanent magnet under a (double-jointed?) bar-and-ball pendulum on a magnetic bearing would produce enough Brownian motion to get enough uniform randomness to call it sufficiently entropic for the purposes of a CSPRNG? Nondeterministic NDE fluid calculations diffract into many possible outcomes, but a brute-force combinatorial search of all the possible return values is still deterministically ordered. And the quantum computing folks are working on increasing coherence / reducing error propagation, like ECC.)

- [ ] DOC: The docs could advise regarding which decent-enough open source hardware RNG would be supported by os.urandom or random.SystemRandom if rngd is not configured to keep nondeterministically seeding with entropy, which should presumably come from a uniformly random entropic process.

There should be better software random in Python. On Tue, Nov 15, 2022, 1:19 AM James Johnson <jj126979@gmail.com> wrote:
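On the parenthetical question about `os.urandom()` versus `random.SystemRandom()`: the Python documentation states that SystemRandom draws its bytes from os.urandom, so both read the same OS source; a small comparison sketch (the modulo form is only for illustration and carries a tiny bias):

```python
import os
import random

sysrand = random.SystemRandom()   # documented to use os.urandom() under the hood

# Both of these ultimately read from the OS CSPRNG:
print(sysrand.randrange(60))                        # uniform in [0, 60)
print(int.from_bytes(os.urandom(8), "big") % 60)    # same idea, with negligible modulo bias

# secrets.randbelow(60) is the stdlib helper for the second form.
```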

I want to be good-natured about it, not combative. I tried whatever random function I found in documentation online, and used it. Without analyzing the world, I was simply dissatisfied with the results. Here’s what I settled on. I actually “put my thumb on the scales” to rule out repetitions for seven questions, so randomness didn’t ultimately answer my question. I’m not sure that the function I “got” was the Mersenne Twister. I do know Mersenne was a truly great mathematician. I acknowledge that updating with time.asctime() EVERY time I update the hash would be random; I’m not qualified to know if it would be “more” random. After you ask about bell curve vs. square distributions, I am quite out of my depth. Here’s my “wizard” for anyone who wants to sell him cryptocurrency or overcome his objections to the environmentalist agenda, or query him about foreign policy. I hope you find it better than average. https://drive.google.com/file/d/1EqQsMfBHDrNpOBQrI7CJxrpbowvKPD2G/ Regards, James J On Tue, Nov 15, 2022 at 4:31 AM Wes Turner <wes.turner@gmail.com> wrote:

On Tue, 15 Nov 2022 at 21:47, James Johnson <jj126979@gmail.com> wrote:
I tried whatever random function I found in documentation online, and used it. Without analyzing the world, I was simply dissatisfied with the results.
Can you explain what you were dissatisfied by? It's worth noting that, for a lot of purposes, people don't WANT true randomness (eg your choice to reject duplicates, which is a violation of randomness, but will make things more satisfying for humans). But that's completely different from trying to come up with a better RNG. ChrisA

James Johnson writes:
On the Python lists, you'll get much better discussion if you're specific about your "dissatisfaction". (If it's just the repetition problem, that's better handled by composing with a filter than by altering the PRNG itself, which should be as close to uniform as possible to make generating other distributions as simple as possible.)

Your method *as described* is unclear. Don Knuth tried something similar (you have better tools available than he did, but the basic idea seems to be similarly "let's compose a few pseudo-randomizing transformations"), and got an embarrassing result. (Spoiler in footnote [1] if you don't feel like digging through Seminumerical Algorithms.) You may have had more method than Knuth did (cf. Dave Mertz's statement that it "looks like Random123"), but again, it would really help communication if you described why you think your PRNG will do better than Python's current one. It's a field that has been heavily researched for many decades now, and there are quite a few core devs with a lot of knowledge and experience in it (Python apps provide quite a bit of the "attack surface" exposed on the Internet; it would be irresponsible if there weren't!)

By the way, nobody here is going to laugh if it turns out that your thinking was naive. My point is to communicate more effectively, not that your idea is bad (despite having a copy of Seminumerical Algorithms, I'm not competent to evaluate your algorithm itself :-). How you feel about "getting schooled" is a personal thing, but my personal experience has been that there's an excellent education to be had by posting naive ideas to this list. :-)

Sincere regards,

N.B. Deliberately not trimming so you have to work to see the spoiler. :-)
Footnotes: [1] He composed several pseudo-random transformations, mostly linear congruential but also including at least one nonlinear one (middle square). The resulting sequence always converged to a short cycle within a handful of iterations.

I am an amateur enthusiast. The random function I found in documentation was in 2014 or so. I do not have the vocabulary to differentiate between various PRNGs. I only knew of one, and it returned some version of the bell curve, with three favorite values, and one of THEM a pronounced preference. The scholars here are referencing white papers and using hardware and software interchangeably (?) I am not a randomization expert, and I do not seek academic recognition. It's a novel hack is all, and I don't know how to invoke the quality algorithms you reference.

Chris A. asked me to "post" a module. I'm pretty sure he means on GitHub, and I don't know how to use it. I attach a script with a randbelow() function. The kernel of my idea is to use randomness from a hash value, so I declared the hash globally, and called it in the function. I know this may not be correct, but I don't know how to declare one momentarily and then destruct it or release the RAM the next minute, between random numbers. Normally, I would update the hexdigest outside the function, and pass it to randbelow(range, hexdigest_string) as two arguments. Chris A wants it to feed an analysis program that may not agree well with the "permanence" of a hash between random numbers.

I am willing to put in effort, but it's a steep learning curve for me. I DO understand that my first script was a lot of data entry, data checking and elementary reporting that was not germane. I use my fast computer off-line, and connect to the web with a Chromebook, so online tools would mean changing my setup. I remain interested to help and learn. I hope this is not too bare-bones. On Tue, Nov 15, 2022 at 7:17 AM Stephen J. Turnbull < stephenjturnbull@gmail.com> wrote:

On Wed, 16 Nov 2022 at 00:59, James Johnson <jj126979@gmail.com> wrote:
Chris A. asked me to "post" a module. I'm pretty sure he means on GitHub, and I don't know how to use it. I attach a script with a randbelow() function.
No no, here is absolutely fine :) Although the code you posted doesn't run. There are a couple of syntactic errors, and after that, a NameError that I'm not sure how to fix. Can you make sure the code runs please? ChrisA

"Do hash functions such as SHA-2 and SHA-3 output random numbers, and are they therefore good P/RNGs?" https://www.quora.com/Do-hash-functions-such-as-SHA-2-and-SHA-3-output-rando... - ```quote See NIST.SP.800–90A. It details how to use both hash functions and block ciphers as PRNGs. When used properly (see NIST publication), hash functions are perfectly suitable as PRNGs and are used that way frequently. ``` - NIST SP 800-90A Rev. 1 "Recommendation for Random Number Generation Using Deterministic Random Bit Generators" https://csrc.nist.gov/publications/detail/sp/800-90a/rev-1/final PDF: https://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.800-90Ar1.pdf Keywords: Deterministic random bit generator (DRBG); entropy; hash function; random number generator. - FWIW, {blockchain xyz(*) SHA-256}'s block hash digits do appear to be conditionally independent in hexadecimal (radix 16) with tx messages and nonces for inputs. - (*) informal study of XBT block hashes - ```quote There are a number of reasons why hash functions should not be used as RNGs, and hash Merkle-Dmagård constructions (including SHA1 and SHA2) should not be used directly for key derivation. That is, something like SHA2(salt+secret) should not be used to generate a key. Instead if you want to use SHA2 to derive a key, you should use it within an HMAC construction, or better still HKDF as a key derivation function (KDF). SHA3 is different. One of the design criteria for the SHA3 is that not have that limitation of its predecessors. So in principle it can be used that way. However, you still should roll your own KDF from SHA3. Instead you should use a professionally designed and implementation KDF. Internally, a KDF that uses SHA3 will be more direct in its use of the hash function than one designed to handle the limitations of SHA2, if you do try to roll your own you are less likely to shoot yourself in the foot with SHA3. But if you are asking the question you are asking, you definitely should not be rolling your own. Now perhaps you weren’t thinking about key derivation. Perhaps your question is more theoretical. In a his case it is important to understand that among other things, a secure RNG must keep its internal state secret. That way, revealing one key does not reveal information about other keys. Hash functions are not really designed with that in mind. (Well, SHA3 is, but it is still not designed to be an RNG.) SHA2 would make a terrible random number generator. SHA3 comes a lot closer, but still isn’t build for the purpose. ``` On Tue, Nov 15, 2022 at 8:58 AM James Johnson <jj126979@gmail.com> wrote:

I composed in vi, and the tabs were a problem. I apologize; there is no good excuse for not having run the function before attaching it. I had a logic problem. The accumulated decimal conversion of the hex digits needed to be reduced modulo the range to stay within range, and I had the name of the hex string (pool) wrong. It runs as a program now. From quickly testing large ranges, it hovers near the mean, but on 10 and 60 it seemed to do ok. I am far less optimistic now. James On Tue, Nov 15, 2022 at 7:56 AM James Johnson <jj126979@gmail.com> wrote:

James Johnson writes:
I'm no expert either, but hash functions and PRNGs have been around a long time. I'm sure this approach has been discovered independently. Besides the Mersenne Twister that Python uses, POSIX systems like Linux, macOS, and BSD provide a number of random number generation facilities, briefly documented in the man pages. The BSD man pages for random and arc4random mention the Fortuna algorithm and the arc4 algorithm. At least the latter is similar to yours in that it "mixes in" additional entropy gradually as more random numbers are generated. For sources of "entropy" (random bits), the man pages on /dev/random and /dev/urandom provide nice brief explanations.

These are very many reasons that the Python PRNG is indeed adequate. Maybe documentation for amateurs and newbies could be improved so that they get the best results possible without knowing how much work it is. How to call the built-in PRNG seems to change a good bit. As for the hack: I feel obligated to connect to the internet and do research before I speak further. IF it DOES do 0-9 randomly, even if it is not useful for larger ranges, it may serve by CONCATENATING base-10 digits? I feel like it’s the punchline to a lame joke now, but thank you all again. I will be reading for weeks. On Wed, Nov 16, 2022 at 9:36 AM Stephen J. Turnbull < stephenjturnbull@gmail.com> wrote:
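On the concatenation idea above: if a source really did emit independent, uniform digits 0-9, larger uniform values could be assembled by concatenation; a sketch using Python's own PRNG as a stand-in for that hypothetical digit source:

```python
import random

def digits_to_number(digit_source, k: int) -> int:
    """Concatenate k independent uniform base-10 digits into a uniform
    integer in [0, 10**k). Uses whatever digit source is supplied."""
    value = 0
    for _ in range(k):
        value = value * 10 + digit_source()
    return value

# Stand-in digit source; substitute your own generator to test it.
digit = lambda: random.randrange(10)

print(digits_to_number(digit, 6))   # uniform in [0, 999999] if the digits are truly uniform
```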

On Tue, Nov 15, 2022 at 12:19:48AM -0600, James Johnson wrote:
The random function in Python is not really adequate for a magic eight ball program,
That is an astonishing claim for you to assert without evidence. Python's PRNG is the Mersenne Twister, which is one of the most heavily studied and best PRNGs known. Yes, it's a bit old now, but it still gets the job done, and is more than adequate for non-security related randomness, including magic eight ball programs. There are PRNGs which are faster, or use less memory, or have better statistical properties, but not usually all at once. The MT is the work-horse of modern PRNGs. What properties do you think it lacks which are necessary for a magic eight ball program? Arguably, the Mersenne Twister is hugely overkill for a M8B program. It has a period of at least 2**19937-1, or about 10**6001 (that's a 1 followed by 6001 zeroes). Which means that if you ran your M8B a billion billion billion times a second, it would take more than a billion billion billion [five thousand more billions] billion years before the cycle of responses started again. Of course there are cleverer ways of predicting the output of a Mersenne Twister than waiting for the cycle to repeat, but it's a magic eight ball. Who cares? Even a 32-bit linear congruential generator, so long as it is not utterly dire, would do the job. -- Steve
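A quick back-of-the-envelope check of the period figure quoted above:

```python
from math import log10

# The Mersenne Twister period is 2**19937 - 1; how many decimal digits is that?
digits = int(19937 * log10(2)) + 1
print(digits)    # 6002 digits, i.e. on the order of 10**6001
```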

On Tue, 15 Nov 2022 at 01:34, James Johnson <jj126979@gmail.com> wrote:
I wrote the attached Python 3 code to improve on existing PRNG functions. I used the time module for one method, which resulted in disproportionate odd values, but agreeable means.
Current time of day is NOT random, and the low bits depend wildly on the available computer clock. Many (all?) computer clocks today do not have actual nanosecond resolution, so taking just the low bits in this way is going to give badly skewed values.
I used the hashlib module for the second. It is evident that the code is amateur, but the program might result in better PRN generation.
Hashing is a known way to improve the distribution of random numbers, but it doesn't add entropy. You need to first have actual entropy to work with.
The "app" lends itself to checking, using statistical tools (see comments.)
Averages are only one possible test for random distribution. Here are a bunch more: https://calmcode.io/blog/inverse-turing-test.html Try your random number generation there, and see how random it truly is. In what way is your code better than current? ChrisA
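A quick way to see the clock-resolution point on your own machine (results vary by OS and hardware; this simply tallies the lowest bit of time.time_ns()):

```python
import time
from collections import Counter

# Tally the lowest bit of the nanosecond clock over many samples.
counts = Counter(time.time_ns() & 1 for _ in range(100_000))
print(counts)   # a heavily lopsided result means the low bits are not usable as-is
```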

On Sat, Nov 05, 2022 at 01:37:30AM -0500, James Johnson wrote:
First the good news: your random number generator at least is well distributed. We can look at the mean and stdev, and it is about the same as the MT PRNG:
There's no real difference there. But let's look at rising and falling sequences. Let's walk through the two runs of random digits, and take a +1 if the value goes up, -1 if it goes down, and 0 if it stays the same. With a *good quality* random sequence, one time in ten you should get the same value twice in a row (on average). And the number of positive steps and negative steps should be roughly the same. We can see that with the MT output:
That's quite close to the 100,000 zeroes we would expect. And the +1s and -1s almost cancel each other out, with only a small imbalance:
>>> sum(steps_mt)
431
The ratio of -ve steps to +ve steps should be 1, and in this sample we get 0.99904, which is pretty close to what we expect. These results correspond to sample probabilities:

* Probability of getting the same as the previous number: 0.09983 (theoretical 0.1)
* Probability of getting a larger number than the previous: 0.45030 (theoretical 0.45)
* Probability of getting a smaller number than the previous: 0.44987 (theoretical 0.45)

So pretty close, and it supports the claim that MT is a very good quality RNG. But if we do the same calculation with your random function, we get this:
Wow! This gives us probabilities:

* Probability of getting the same as the previous number: 0.09615 (theoretical 0.1)
* Probability of getting a larger number than the previous: 0.41062 (theoretical 0.45)
* Probability of getting a smaller number than the previous: 0.49323 (theoretical 0.45)

I ran the numbers again, with a bigger sample size (3 million instead of 1 million) and the bias got much worse. The ratio of -ve steps to +ve steps, instead of being close to 1, was 1.21908. That's a 20% bias. The bottom line here is that your random numbers, generated using the time, have strong correlations between values. And that makes it a much poorer choice for a PRNG. -- Steve
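For anyone who wants to reproduce this rise/fall test on their own generator, a small sketch of the procedure described above (the function name and sample size are arbitrary):

```python
import random
from collections import Counter

def step_test(digits):
    """Classify each consecutive pair of digits as +1 (rise), -1 (fall) or 0 (tie)."""
    steps = [(b > a) - (b < a) for a, b in zip(digits, digits[1:])]
    counts = Counter(steps)
    n = len(steps)
    return {
        "P(same)": counts[0] / n,       # expect ~0.10 for uniform digits 0-9
        "P(larger)": counts[1] / n,     # expect ~0.45
        "P(smaller)": counts[-1] / n,   # expect ~0.45
        "fall/rise ratio": counts[-1] / counts[1],   # expect ~1.0
    }

mt_digits = [random.randrange(10) for _ in range(1_000_000)]
print(step_test(mt_digits))
# Substitute your own generator's digits to compare against the MT baseline.
```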

participants (8)
- Barry
- Chris Angelico
- David Mertz, Ph.D.
- James Johnson
- Matthias Görgens
- Stephen J. Turnbull
- Steven D'Aprano
- Wes Turner