[Python-Dev] a note in random.shuffle.__doc__ ...

Robert Kern robert.kern at gmail.com
Sat Jun 10 22:56:08 CEST 2006


Alex Martelli wrote:
> ...claims:
> 
> Note that for even rather small len(x), the total number of
> permutations of x is larger than the period of most random number
> generators; this implies that "most" permutations of a long
> sequence can never be generated.
> 
> Now -- why would the behavior of "most" random number generators be  
> relevant here?  The module's docs claim, for its specific Mersenne  
> Twister generator, a period of 2**19997-1, which is (e.g.) a  
> comfortable  
> 130128673800676351960752618754658780303412233749552410245124492452914636 
> 028095467780746435724876612802011164778042889281426609505759158196749438 
> 742986040468247017174321241233929215223326801091468184945617565998894057 
> 859403269022650639413550466514556014961826309062543 times larger than  
> the number of permutations of 2000 items, which doesn't really feel  
> to me like a "rather small len(x)" in this context (what I'm most  
> often shuffling is just a pack of cards -- len(x)==52 -- for example).

I wouldn't be too comfortable with that margin. The fun thing about factorials
is that they add up really quickly. The crossover point is between 2080 and 2081.


In [28]: from scipy import *

In [29]: def f(x):
   ....:     return special.gammaln(x-1) - 19937*log(2)
   ....:

In [30]: optimize.brentq(f, 2000, 3000)
Out[30]: 2082.4031300820125

In [31]: import gmpy

In [32]: mtperiod = 2**19937 - 1

In [33]: for i in range(2075, 2085):
   ....:     if gmpy.fac(i) > mtperiod:
   ....:         print i
   ....:         break
   ....:
   ....:
2081


I think that documenting this boundary might be worthwhile rather than making
vague references to "small len(x)." A note along the lines of Josiah wrote about
there being no guarantees despite period size would also be useful.

OTOH, isn't the exact PRNG algorithm considered an implementation detail? It
certainly was when the module migrated from Wichmann-Hill to the Mersenne
Twister. OTGH, I don't foresee the random module ever using an algorithm with
worse characteristics than the Mersenne Twister.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco



More information about the Python-Dev mailing list