[Python-ideas] About adding a new iteratormethodcalled "shuffled"

Steven D'Aprano steve at pearwood.info
Wed Mar 25 13:28:48 CET 2009


On Wed, 25 Mar 2009 12:59:27 pm Raymond Hettinger wrote:
> > Greg> Think about it: A shuffling algorithm is a function from a
> > random Greg> number to a permutation. There's no way you can get
> > more permutations Greg> out than there are random numbers to put
> > in.
>
> If our random number generator can produce more possible shuffles
> than there are atoms in the universe, I say you don't worry about it.

No, I'm afraid that is a fallacy, because what is important is the 
number of permutations in the list, and that grows as the factorial of 
the number of items.

The Mersenne Twister has a period of 2**19937-1, which sounds huge, but 
it takes a list of only 2081 items for the number of permutations to 
exceed that.

To spell it out in tedious detail: that means that random.shuffle() can 
produce every permutation of a list of 2080 items, but for 2081 items 
approximately 98% of the possibilities can't be reached. For 2082 
items, approx 99.999% will never be reached. And so on.

Don't get me wrong, random.shuffle() is perfectly adequate for any 
use-case I can think of. But beyond 2080 items in the list, it becomes 
greatly biased, and I think that's important to note in the docs. Those 
who need to know about it will be told, and those who don't care can 
continue to not care.



-- 
Steven D'Aprano



More information about the Python-ideas mailing list