
On Thu, Mar 26, 2009 at 4:31 AM, Stephen J. Turnbull <stephen@xemacs.org> wrote:
Adam Olsen writes:
> If random.shuffle() is broken for lists more than 2080 then it should > raise an error.
Not really. Assuming the initial state is drawn from a uniform distribution on all possible states, if all 2080-shuffles are equiprobable, then any statistic you care to calculate based on that will come out the same as if you had 2080 statistically independent draws without replacement. Another way to put it is "if you need a random shuffle, this one is good enough for *any* such purpose".
However, once you exceed that limit you have to ask whether it's good enough for the purpose at hand. For some purposes, the distribution of (2**19937-1)-shuffles might be good enough, even though they make up only 1/(2**19937-2) of the possible shuffles. (Yeah, I know, you can wait....)
Is it or is it not broken? That's all I want to know. "maybe" isn't good enough. "Not broken for small lists" implies it IS broken for large lists. Disabling it (raising an exception for large lists) is of course just a stopgap measure. Better would be a PRNG with a much larger period.. but of course that'd require more CPU time and more seed. -- Adam Olsen, aka Rhamphoryncus