Unsorting(randomizing) a sequence

Thu Aug 19 11:58:31 EDT 1999

Tim Peters:
 > Me:
 > > ...
 > > Another, more straightforward way to test sampling strategies is just
 > > to keep track of how many times each permutation comes up.  However,
 > > since N! gets so large, it takes too long to build up decent
 > > statistics this way.
 > 
 > I'll save you a near-infinity of work, then <wink>:  Python's Wichmann-Hill
 > uniform generator has a period of around 1e12 to 1e13. 

Thanks for the labor-savings.  The point I was getting at however was
that even with a perfect random number generator, if you're not
careful, you can write a function to turn those random numbers into
permutations which does not evenly distribute across the space of
permutations - for example favoring even permutations over odd
permutations, etc.  The goodness of the random-number generator is
another issue...

I really should go off and read more Knuth; as Dan Schmidt pointed out
all of this is discussed there; rather than devising clever
statistical tests, it's far better to *prove* from first principles
that the algorithm is correct.

 Since the point at
 > which you begin in the period completely determines the permutation that
 > will be generated, you can't get more than 1e13 distinct permutations out of
 > it.  So a sequence of length 16 is the longest for which it could even
 > conceivably produce all permutations; if you use it to, e.g., shuffle a deck
 > of cards, the number of permutations any algorithm based on WH can get back
 > is a vanishingly small fraction of the possibilities (since 52! ~= 1e66; we
 > can only get back at most 1 of each 1e66/1e13 = 1e53 possibilities).
 > 
 > So, in a real sense, even the best possible algorithm based on
 > random.random() is terrible, incapable of generating "almost all" of the
 > space for a measly deck of cards.
 > 
 > It's surprising that so few people get disturbed by this; but it's even more
 > surprising that some people do <wink>.
 > 
 > the-deck-is-always-rigged-in-subtle-ways-ly y'rs  - tim
 > 
 > 
 > 
 >