[Edu-sig] More permutation madness [Cryptonomicon]
Tim Peters
tim.one@home.com
Thu, 30 Nov 2000 16:10:00 -0500
[Daniel Yoo]
> There was a question about ways of generating permutations: is there an
> easy way to show that these two functions:
>
> ###
> def permuteRestricted(L):
> newlist = L[:] # shallow copy
> for i in range(len(L)):
> rand_i = randint(i, len(L)-1)
> newlist[i], newlist[rand_i] = newlist[rand_i], newlist[i]
> return newlist
>
> def permuteUnrestricted(L):
> newlist = L[:] # shallow copy
> for i in range(len(L)):
> rand_i = randint(0, len(L)-1)
> rand_j = randint(0, len(L)-1)
> newlist[rand_j], newlist[rand_i] = newlist[rand_i],
> newlist[rand_j]
> return newlist
> ###
>
> act differently?
> ...
By counting. Let N=len(L). Then in the first version, there are N! (N
factorial) possible outcomes (on the first iteration thru the loop, rand_i
can have any of N values; on the second iteration, any of N-1; and so on).
In the second version, N**(2*N) (on the first iteration thru the loop, the
pair (rand_i, rand_j) can take any of N**2 values; and likewise for all the
other iterations).
Since there are N! possible permutations, that the first version has N!
possible outcomes means it's at least *possible* for each permutation to be
equally likely (the counting argument alone can't prove it's fair, though --
it simply doesn't rule fairness out). But the second version can't possibly
be fair unless N**(2*N) is a multiple of N!. That is, I think, what makes
it so seductive: the second version *is* fair if N happens to be 1 or 2.
People think about the small cases, and leap to the conclusion that it "must
be" fair for larger N too. But for N==3, 6 (3!) does not divide 729 (3**6),
so the second version necessarily favors some permutations.
Note that "the usual" wrong algorithm for generating a random permutation is
a bit simpler than permuteUnrestricted:
def permuteUnrestricted_usual(L):
newlist = L[:] # shallow copy
for i in range(len(L)):
rand_j = randint(0, len(L)-1)
newlist[i], newlist[rand_j] = newlist[rand_j], newlist[i]
return newlist
The same kind of counting argument shows that it can't be fair for len(L)>2.
An exact analysis is quite difficult!
probability-is-full-of-surprises-ly y'rs - tim