
On Wed, 25 Mar 2009 12:59:27 pm Raymond Hettinger wrote:
Greg> Think about it: A shuffling algorithm is a function from a random Greg> number to a permutation. There's no way you can get more permutations Greg> out than there are random numbers to put in.
If our random number generator can produce more possible shuffles than there are atoms in the universe, I say you don't worry about it.
No, I'm afraid that is a fallacy, because what is important is the number of permutations in the list, and that grows as the factorial of the number of items. The Mersenne Twister has a period of 2**19937-1, which sounds huge, but it takes a list of only 2081 items for the number of permutations to exceed that. To spell it out in tedious detail: that means that random.shuffle() can produce every permutation of a list of 2080 items, but for 2081 items approximately 98% of the possibilities can't be reached. For 2082 items, approx 99.999% will never be reached. And so on. Don't get me wrong, random.shuffle() is perfectly adequate for any use-case I can think of. But beyond 2080 items in the list, it becomes greatly biased, and I think that's important to note in the docs. Those who need to know about it will be told, and those who don't care can continue to not care. -- Steven D'Aprano