[Python-ideas] About adding a new iteratormethodcalled "shuffled"

Steven D'Aprano steve at pearwood.info
Thu Mar 26 00:58:58 CET 2009


On Thu, 26 Mar 2009 08:41:27 am Marcin 'Qrczak' Kowalczyk wrote:
> On Wed, Mar 25, 2009 at 13:28, Steven D'Aprano <steve at pearwood.info> 
wrote:
> > Don't get me wrong, random.shuffle() is perfectly adequate for any
> > use-case I can think of. But beyond 2080 items in the list, it
> > becomes greatly biased, and I think that's important to note in the
> > docs. Those who need to know about it will be told, and those who
> > don't care can continue to not care.
>
> Why anyone would care? Orderings possible to obtain from a given good
> random number generator are quite uniformly distributed among all
> orderings. 

Yes, that holds true for n <= 2080, since Fisher-Yates is an unbiased 
shuffler. But I don't think it remains true for n > 2080 since the vast 
majority of possible permutations have probability zero. I'm not saying 
that this absolutely *will* introduce statistical bias into the 
shuffled lists, but it could, and those who care about that risk 
shouldn't have to read the source code to learn this.


> I bet you can't even predict any particular ordering which 
> is impossible to obtain.

A moral dilemma... should I take advantage of your innumeracy by taking 
you up on that bet, or should I explain why that bet is a sure thing 
for me? *wink*

Since the chances of me collecting on the bet is essentially near zero, 
I'll explain.

For a list with 2082 items, shuffle() chooses from a subset of 
approximately 0.001% of all possible permutations. This means that if I 
give you a list of 2082 items and tell you to shuffle it, and then 
guess that such-and-such a permutation of it will never be reached, I 
can only lose if by chance I guessed on the 1 in 100,000 permutations 
that shuffle() can reach. I have 99,999 chances to win versus 1 to 
lose: that's essentially a sure thing.

In practical terms, beyond (say) 2085 or so, it would be a bona fide 
miracle if I didn't win such a bet.



-- 
Steven D'Aprano



More information about the Python-ideas mailing list