I don't think this can be done efficiently with numpy alone. If you want to do this efficiently, I wrote a no-replacement sampler in Cython some time ago (below). I hereby release it to the public domain.

'''
Created on Oct 24, 2009
http://stackoverflow.com/questions/311703/algorithm-for-sampling-without-rep...
@author: johnsalvatier
'''
from __future__ import division
import numpy

def random_no_replace(sampleSize, populationSize, numSamples):

    samples = numpy.zeros((numSamples, sampleSize), dtype=int)

    # Use Knuth's variable names
    cdef int n = sampleSize
    cdef int N = populationSize
    cdef int i = 0
    cdef int t = 0  # total input records dealt with
    cdef int m = 0  # number of items selected so far
    cdef double u

    while i < numSamples:
        t = 0
        m = 0
        while m < n:
            u = numpy.random.uniform()  # call a uniform(0,1) random number generator
            if (N - t)*u >= n - m:
                # skip record t
                t += 1
            else:
                # select record t as the m-th item of sample i
                samples[i, m] = t
                t += 1
                m += 1
        i += 1

    return samples

On Mon, Dec 20, 2010 at 8:28 AM, Alan G Isaac <alan.isaac@gmail.com> wrote:
I want to sample *without* replacement from a vector (as with Python's random.sample). I don't see a direct replacement for this, and I don't want to carry two PRNGs around. Is the best way something like this?
permutation(myvector)[:samplesize]
Thanks,
Alan Isaac
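For anyone who wants to try the selection-sampling loop from the Cython code without compiling anything, here is a rough pure-Python sketch of the same inner loop for a single sample. The function name `sample_indices_no_replace` and the `rng` parameter are made up for illustration, and it draws from Python's `random` module rather than numpy's generator purely so that it is self-contained; it will be much slower than the Cython version.

```python
import random


def sample_indices_no_replace(n, N, rng=random):
    """Return n distinct indices from range(N), in increasing order.

    Selection sampling with Knuth's variable names: each index t is
    visited once and kept with probability (n - m) / (N - t), where m
    is the number of indices kept so far.

    `rng` is a hypothetical hook: any object with a random() method
    that returns a uniform float in [0, 1).
    """
    if not 0 <= n <= N:
        raise ValueError("need 0 <= n <= N")
    chosen = []
    t = 0  # total input records dealt with
    m = 0  # number of items selected so far
    while m < n:
        u = rng.random()
        if (N - t) * u < n - m:
            # select record t
            chosen.append(t)
            m += 1
        # whether selected or skipped, move on to the next record
        t += 1
    return chosen


if __name__ == "__main__":
    myvector = list("abcdefghij")
    idx = sample_indices_no_replace(3, len(myvector), random.Random(0))
    print([myvector[i] for i in idx])
```

The indices come back sorted, so the sample keeps the order of the original vector; shuffle the result afterwards if a random order matters.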