Simple algorithm question - how to reorder a sequence economically
Steven D'Aprano
steve+comp.lang.python at pearwood.info
Fri May 24 09:57:21 EDT 2013
On Fri, 24 May 2013 06:23:14 -0700, Peter Brooks wrote:
> Thanks for the warnings about random numbers too - I hope my lists will
> be short enough for the permutations of the function to be irrelevant. I
> don't need every single sequence to be unique, only that the same
> sequence only occurs occasionally. My lists might get up to the ~2k
> length one day, and I'll keep in mind the need, at that stage, to use a
> different pseudo-random generator. Actually, thinking about it, there is
> probably a source of non-algorithmically-derived 'random' numbers
> somewhere on the net that would do the job nicely.
That's massive overkill.
Take a closer look at what Ned wrote:
"The default random number generator has a period of 2**19937-1"
and consider the numbers.
For a list with 3000 items, there are 3000! possible permutations. That
is approximately 10**21024. That is, a number with 21024 digits, or
somewhere around a trillion trillion trillion ... trillion trillion,
where there are 1752 "trillions".
If you could generate a million permutations a second, it would take you
on average 10**210988 centuries before getting the original permutation
again. That's the expected result you would get with a source of true
randomness.
Instead, with Python's default pseudo-random number generator, you cannot
get the full 3000! permutations, but only 2**19937-1 of them. Using the
same calculation as above, that means that you will have to generate a
million permutations per second for "only" 10**13783 centuries before
getting the original permutation again.
There are uses where being able to generate any possible permutation is
important, and the default PRNG is not sufficient. But merely shuffling
your data around to avoid spurious correlations is not one of them. Save
yourself a lot of development effort, and debugging, and just use
random.shuffle.
--
Steven
More information about the Python-list
mailing list