[Numpy-discussion] extracting a random subset of a vector
Rick White
rlw at stsci.edu
Tue Aug 31 12:49:06 EDT 2004
On Tue, 31 Aug 2004, Curzio Basso wrote:
> Hi all, I have an optimization problem.
>
> I currently use the following code to select a random subset of a rank-1
> array:
Here's a slightly faster version. It's about 3x faster than Chris Barker's
version (4x faster than your original version) for N=1000000, M=100:
import numarray as NA
import numarray.random_array as RA
from math import sqrt

N = 1000000
M = 100
full = NA.arange(N)
r = RA.random(N)
# threshold chosen so that about M+3*sqrt(M) elements fall below it
thresh = (M+3*sqrt(M))/N
subset = NA.compress(r<thresh, full)
while len(subset) < M:
    # rarely executed
    thresh = thresh+3*sqrt(M)/N
    subset = NA.compress(r<thresh, full)
# permute only the short candidate list and keep the first M entries
subset = subset[RA.permutation(len(subset))[:M]]
By the way, I also find that most of the time gets spent in the
permutation computation. That's why this version is faster: it only
has to permute about M+3*sqrt(M) candidates instead of all N elements.
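
For readers without numarray, here is a rough sketch of the same
thresholding idea written against NumPy. It is not part of the original
post: the function name random_subset, its parameters, and the use of
numpy.random.default_rng are assumptions made for illustration, and the
timings quoted above were not measured with this code.

```python
import math

import numpy as np


def random_subset(values, m, rng=None):
    """Return m distinct elements of a 1-D array, in random order.

    Hypothetical helper: a NumPy port of the thresholding trick,
    not the original numarray code.
    """
    rng = np.random.default_rng() if rng is None else rng
    values = np.asarray(values)
    n = values.shape[0]
    if not 0 <= m <= n:
        raise ValueError("m must be between 0 and len(values)")

    # One uniform draw per element; keeping the elements whose draw
    # falls below thresh yields about m + 3*sqrt(m) candidates.
    r = rng.random(n)
    step = 3 * math.sqrt(m) / n
    thresh = m / n + step
    candidates = np.flatnonzero(r < thresh)
    while candidates.size < m:
        # rarely executed: too few candidates, so raise the threshold
        thresh += step
        candidates = np.flatnonzero(r < thresh)

    # Shuffle only the short candidate list and keep the first m.
    chosen = rng.permutation(candidates)[:m]
    return values[chosen]


if __name__ == "__main__":
    subset = random_subset(np.arange(1000000), 100)
    print(len(subset), len(set(subset.tolist())))
```

NumPy also offers Generator.choice(a, size, replace=False) for sampling
without replacement, which may be the simpler option when the
thresholding trick is not needed.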
Rick