[Numpy-discussion] Generating random samples without repeats

Paul Moore pf_moore at yahoo.co.uk
Thu Sep 18 17:55:11 EDT 2008


I want to generate a series of random samples, to do simulations based 
on them. Essentially, I want to be able to produce a SAMPLESIZE * N 
matrix, where each row of N values consists of either

1. Integers between 1 and M (simulating N rolls of an M-sided die), or
2. A sample of N numbers between 1 and M without repeats (simulating
    deals of N cards from an M-card deck).

Case (1) is easy: numpy.random.random_integers(1, M, (SAMPLESIZE, N))

But I can't find an obvious equivalent for (2). Am I missing something 
glaringly obvious? I'm using numpy - is there maybe something in scipy I 
should be looking at?
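
For illustration, here is a rough sketch of one way (2) could be done with 
plain array operations. It is not a built-in. It draws a row of M random 
keys per sample, argsorts each row to get a random permutation, and keeps 
the first N columns. The name deal_without_repeats is hypothetical, and 
numpy.random.default_rng is assumed to be available (it comes from NumPy 
releases newer than this post).

```python
import numpy as np

def deal_without_repeats(samplesize, n, m, rng=None):
    """Return a (samplesize, n) array; each row holds n distinct integers in 1..m.

    Hypothetical helper, not part of NumPy.
    """
    if n > m:
        raise ValueError("cannot draw n distinct values from only m candidates")
    # Assumption: a NumPy version that provides default_rng.
    rng = np.random.default_rng() if rng is None else rng
    # One row of m random keys per sample; argsorting the keys yields an
    # independent random permutation of 0..m-1 in every row.
    keys = rng.random((samplesize, m))
    order = np.argsort(keys, axis=1)
    # The first n entries of a random permutation are a sample without
    # repeats; shift from 0-based to 1-based card numbers.
    return order[:, :n] + 1

# Example: 1000 five-card deals from a 52-card deck.
deals = deal_without_repeats(1000, 5, 52)
```

This sorts all M keys for every row, so it does more work than strictly 
needed when N is much smaller than M, but it avoids a Python-level loop 
over the rows.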

Also, in evaluating samples, I'm likely to want to calculate 
combinatorial functions, such as the list of all pairs of items from a 
sample (imagine looking at how many pairs add up to 15 in a cribbage 
hand). Clearly, I can write a normal Python function which does this for 
one row, and use apply_along_axis - but that's *slow*. I'm looking for a 
function that, given an N*M array and a sample size S, gives a 
C(N,S)*S*M array of all the combinations, which runs at array-processing 
speeds (preferably without having to code it in C myself!!) Is there 
anywhere with this type of function available?
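
As a sketch of what such a function might look like (an assumption about 
one workable approach, not an existing numpy routine): build the 
C(N,S) index tuples once with itertools.combinations, then let fancy 
indexing do the per-element work. The names all_combinations and 
count_pairs_summing_to are hypothetical. The second function applies the 
same idea to the cribbage example, using numpy.triu_indices to index every 
pair of columns at once.

```python
import itertools
import numpy as np

def all_combinations(arr, s):
    """Given an (N, M) array, return a (C(N, s), s, M) array of all
    s-row combinations. Hypothetical helper, not part of NumPy."""
    n = arr.shape[0]
    # The index table is built in Python, but only once per (n, s);
    # shape (C(n, s), s).
    idx = np.array(list(itertools.combinations(range(n), s)), dtype=int)
    idx = idx.reshape(-1, s)  # keeps a 2-D shape even when there are no combinations
    # Fancy indexing gathers the rows at array speed.
    return arr[idx]

def count_pairs_summing_to(hands, target):
    """For a (SAMPLESIZE, N) array of hands, count the pairs of items in
    each row that add up to target."""
    n = hands.shape[1]
    # i, j enumerate every unordered pair of column positions with i < j.
    i, j = np.triu_indices(n, k=1)
    pair_sums = hands[:, i] + hands[:, j]   # shape (SAMPLESIZE, C(n, 2))
    return (pair_sums == target).sum(axis=1)

# Example: all pairs of rows from a 4x3 array, and pairs adding up to 15.
combos = all_combinations(np.arange(12).reshape(4, 3), 2)
fifteens = count_pairs_summing_to(np.array([[5, 10, 7, 8, 1]]), 15)
```

The index table depends only on N and S, so it can be computed once and 
reused across every sample in a simulation.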

This type of combinatorial simulation seems to me to be a fairly good 
fit for numpy's capabilities, and yet I can't seem to find things that 
seem relevant. Is it simply not something that people use numpy for? Or 
am I looking in the wrong places in the documentation?

Thanks for any help,
Paul.



