random.randint() slow, esp in Python 3

Sat Sep 24 08:53:41 EDT 2011

On Sep 24, 10:38 am, Ian Kelly <ian.g.ke... at gmail.com> wrote:
> On Sat, Sep 24, 2011 at 12:55 AM, Steven D'Aprano
>
>
>
>
>
>
>
>
>
> <steve+comp.lang.pyt... at pearwood.info> wrote:
> > If you want unbiased, random (or at least pseudo-random) integers chosen
> > from an uniform distribution with proper error checking, you should use
> > randint or randrange.
>
> >> but if you just
> >> want a pile of more-or-less random integers, int(random.random()*top)
> >> is the best option.
>
> > "I only need random-ish ints, but I need 'em yesterday!!!"
>
> > If you want biased, not-quite-uniformly distributed integers with no error
> > checking, then the above will save you a few nanoseconds per call. So if
> > your application needs a million such ints, you might save a full five
> > seconds or so. Sounds like premature optimization to me, but what do I
> > know? If you have an application where the quality of randomness is
> > unimportant and generating random ints is a genuine performance bottleneck,
> > then go right ahead.
>
> Peeking under the hood, after all the error-checking and extra
> features, randrange basically just does int(random.random() * top).
> Any bias present in the latter will also be present in the former.
>
> Cheers,
> Ian

In Python 2.x that's true, and indeed randrange has some fairly bad
behaviour for large arguments ( see http://bugs.python.org/issue9025
).  In Python 3.2 and up, randrange is more careful:  it doesn't
introduce any extra bias beyond that already present in the random
source (Mersenne Twister).  If you look at the source, you'll see that
Python 2.x only calls _getrandbits for large arguments, while Python
3.2+ calls _getrandbits for *every* invocation of randrange.  (All
assuming that we're using the default MT random number generator.)
This may well explain the difference in timings observed by the OP.

--
Mark