I don't think this can be done efficiently with numpy alone. I wrote a no-replacement sampler in Cython some time ago (below), and I hereby release it to the public domain. It returns numSamples rows, each holding sampleSize distinct indices drawn from range(populationSize).

'''
Created on Oct 24, 2009
http://stackoverflow.com/questions/311703/algorithm-for-sampling-without-replacement
@author: johnsalvatier
'''

from __future__ import division

import numpy

def random_no_replace(sampleSize, populationSize, numSamples):

    samples = numpy.zeros((numSamples, sampleSize), dtype=int)

    # Use Knuth's variable names
    cdef int n = sampleSize
    cdef int N = populationSize
    cdef int i = 0
    cdef int t = 0  # total input records dealt with
    cdef int m = 0  # number of items selected so far
    cdef double u

    while i < numSamples:
        t = 0
        m = 0
        while m < n:
            u = numpy.random.uniform()  # call a uniform(0,1) random number generator
            if (N - t)*u >= n - m:
                # skip record t
                t += 1
            else:
                # select record t as the m-th item of sample i
                samples[i, m] = t
                t += 1
                m += 1
        i += 1

    return samples
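
For anyone who wants to try the idea without compiling anything, here is a rough pure-Python sketch of the same selection-sampling loop for a single sample. It is not part of the module above: the name selection_sample and the rng argument are made up for illustration, and it will be far slower than the Cython version. It draws from a numpy RandomState, so only one PRNG is involved.

```python
import numpy as np

def selection_sample(sample_size, population_size, rng=None):
    """Return sample_size distinct indices from range(population_size), in increasing order.

    Hypothetical helper for illustration only; mirrors the inner loop of the
    Cython function above for a single sample.
    """
    if not 0 <= sample_size <= population_size:
        raise ValueError("sample_size must be between 0 and population_size")
    if rng is None:
        rng = np.random.RandomState()
    chosen = np.empty(sample_size, dtype=int)
    t = 0  # records examined so far
    m = 0  # records selected so far
    while m < sample_size:
        u = rng.uniform()  # uniform on [0, 1)
        # Select record t with probability (n - m) / (N - t).
        if (population_size - t) * u < sample_size - m:
            chosen[m] = t
            m += 1
        t += 1
    return chosen

# Usage: the function returns indices, so index the vector with them.
rng = np.random.RandomState(0)
myvector = np.arange(100, 200)
idx = selection_sample(5, len(myvector), rng)
sample = myvector[idx]
```

Note that the indices come out sorted, so the sample keeps the order of the original vector. If a random order matters, the result would still need to be shuffled.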

On Mon, Dec 20, 2010 at 8:28 AM, Alan G Isaac <alan.isaac@gmail.com> wrote:

I want to sample *without* replacement from a vector
(as with Python's random.sample). I don't see a direct
replacement for this, and I don't want to carry two
PRNGs around. Is the best way something like this?

permutation(myvector)[:samplesize]

Thanks,
Alan Isaac
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion