[Numpy-discussion] random.choice(replace=False) very slow

Matthew Brett matthew.brett at gmail.com
Wed Oct 17 13:47:13 EDT 2018


Hi,

I noticed that numpy.random.choice is very slow with the
replace=False option, and that it can (for most cases) be made
several hundred times faster in Python code:

In [18]: sample = np.random.uniform(size=1000000)
In [19]: timeit np.random.choice(sample, 500, replace=False)
42.1 ms ± 214 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [22]: def rc(x, size):
    ...:     n = np.prod(size)
    ...:     n_plus = n * 2
    ...:     inds = np.unique(np.random.randint(0, n_plus+1, size=n_plus))[:n]
    ...:     return x[inds].reshape(size)
In [23]: timeit rc(sample, 500)
86.5 µs ± 421 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

Is there a reason why it's so slow in C?  Could something more
intelligent than the above be used to speed it up?
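
One possible shape for such a speed-up, as a minimal sketch rather than a
proposal for NumPy itself: draw candidate indices with replacement from the
full range of x, keep each distinct index in the order it first appears, and
top up until there are enough. Note that rc above draws its indices from
0..n_plus rather than from all of x, and that np.unique returns them sorted,
so the sketch differs on both points. The function name
choice_without_replacement is made up for illustration, and a 1-D x is
assumed.

```python
import numpy as np


def choice_without_replacement(x, size):
    """Draw prod(size) distinct elements of the 1-D array x (illustrative sketch)."""
    n = int(np.prod(size))
    if n > len(x):
        raise ValueError("cannot draw more samples than elements without replacement")
    chosen = np.empty(0, dtype=np.intp)
    while chosen.size < n:
        # Oversample by 2x the number still missing, to allow for duplicates.
        draws = np.random.randint(0, len(x), size=2 * (n - chosen.size))
        combined = np.concatenate([chosen, draws])
        # np.unique sorts by value, so use return_index to recover the
        # position of each first occurrence and restore the draw order.
        _, first = np.unique(combined, return_index=True)
        chosen = combined[np.sort(first)]
    return x[chosen[:n]].reshape(size)
```

This stays cheap only while the sample is small relative to len(x). As the
sample size approaches len(x), duplicates dominate and a permutation-based
approach is likely the better choice.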

Cheers,

Matthew

