random.randint() slow, esp in Python 3
dickinsm at gmail.com
Sat Sep 24 14:53:41 CEST 2011
On Sep 24, 10:38 am, Ian Kelly <ian.g.ke... at gmail.com> wrote:
> On Sat, Sep 24, 2011 at 12:55 AM, Steven D'Aprano
> <steve+comp.lang.pyt... at pearwood.info> wrote:
> > If you want unbiased, random (or at least pseudo-random) integers chosen
> > from an uniform distribution with proper error checking, you should use
> > randint or randrange.
> >> but if you just
> >> want a pile of more-or-less random integers, int(random.random()*top)
> >> is the best option.
> > "I only need random-ish ints, but I need 'em yesterday!!!"
> > If you want biased, not-quite-uniformly distributed integers with no error
> > checking, then the above will save you a few nanoseconds per call. So if
> > your application needs a million such ints, you might save a full five
> > seconds or so. Sounds like premature optimization to me, but what do I
> > know? If you have an application where the quality of randomness is
> > unimportant and generating random ints is a genuine performance bottleneck,
> > then go right ahead.
> Peeking under the hood, after all the error-checking and extra
> features, randrange basically just does int(random.random() * top).
> Any bias present in the latter will also be present in the former.
In Python 2.x that's true, and indeed randrange has some fairly bad
behaviour for large arguments ( see http://bugs.python.org/issue9025
). In Python 3.2 and up, randrange is more careful: it doesn't
introduce any extra bias beyond that already present in the random
source (Mersenne Twister). If you look at the source, you'll see that
Python 2.x only calls _getrandbits for large arguments, while Python
3.2+ calls _getrandbits for *every* invocation of randrange. (All
assuming that we're using the default MT random number generator.)
This may well explain the difference in timings observed by the OP.
More information about the Python-list