[Python-ideas] About adding a new iteratormethodcalled "shuffled"

Steven D'Aprano steve at pearwood.info
Wed Mar 25 13:28:48 CET 2009

On Wed, 25 Mar 2009 12:59:27 pm Raymond Hettinger wrote:
> > Greg> Think about it: A shuffling algorithm is a function from a
> > random Greg> number to a permutation. There's no way you can get
> > more permutations Greg> out than there are random numbers to put
> > in.
> If our random number generator can produce more possible shuffles
> than there are atoms in the universe, I say you don't worry about it.

No, I'm afraid that is a fallacy, because what is important is the 
number of permutations in the list, and that grows as the factorial of 
the number of items.

The Mersenne Twister has a period of 2**19937-1, which sounds huge, but 
it takes a list of only 2081 items for the number of permutations to 
exceed that.

To spell it out in tedious detail: that means that random.shuffle() can 
produce every permutation of a list of 2080 items, but for 2081 items 
approximately 98% of the possibilities can't be reached. For 2082 
items, approx 99.999% will never be reached. And so on.

Don't get me wrong, random.shuffle() is perfectly adequate for any 
use-case I can think of. But beyond 2080 items in the list, it becomes 
greatly biased, and I think that's important to note in the docs. Those 
who need to know about it will be told, and those who don't care can 
continue to not care.

Steven D'Aprano

More information about the Python-ideas mailing list