[Python-Dev] a note in random.shuffle.__doc__ ...
tim.peters at gmail.com
Sat Jun 10 23:31:39 CEST 2006
> Note that for even rather small len(x), the total number of
> permutations of x is larger than the period of most random number
> generators; this implies that "most" permutations of a long
> sequence can never be generated.
> Now -- why would the behavior of "most" random number generators be
> relevant here? The module's docs claim, for its specific Mersenne
> Twister generator, a period of 2**19997-1, which is (e.g.)
Oops! That's wrong. The period is 2**19937-1.
> a comfortable
> 859403269022650639413550466514556014961826309062543 times larger than
> the number of permutations of 2000 items, which doesn't really feel
> to me like a "rather small len(x)" in this context (what I'm most
> often shuffling is just a pack of cards -- len(x)==52 -- for example).
> I suspect that the note is just a fossil from a time when the default
> random number generator was Whichman-Hill, with a much shorter
> period. Should this note just be removed, or instead somehow
> reworded to point out that this is not in fact a problem for the
> module's current default random number generator? Opinions welcome!
It should be removed now. I'll do that :-)
WH's period was indeed so short that it couldn't generate even a tiny
fraction of the permutations of a deck of cards, and that's why the
note was added.
While a long period is necessary to get a shot at all permutations,
it's not sufficient, and I don't know what the true story is wrt the
Twister. For example, a miserable PRNG that returns
0.1, 0.2, 0.2,
0.1, 0.2, 0.2, 0.2,
0.1, 0.2, 0.2, 0.2, 0.2,
has infinite period, but has few (O(N)) distinct subsequences of
length N. That's a failure of so-called equidistribution in N
dimensions (for sufficiently large N, some N-vectors appear more often
than others across the whole period). "A long" period is necessary
but not sufficient for high-dimensional equidistribution.
Off the top of my head, then, since the Twister is provably
equidistributed in 623 dimensions to 32-bit accuracy, I expect it
should be able to "fairly" generate all permutations of a sequence of
<= 623 elements (equidistribution in N dimensions implies
equidistribution in all dimensions <= N). So I'm happy to leave a
warning out until the casinos switch to 12-deck blackjack ;-)
More information about the Python-Dev