[Numpy-discussion] Pull Request Review: R-like sample function

Christopher Jordan-Squire cjordan1 at uw.edu
Thu Sep 1 18:02:28 EDT 2011


Hi--I've just submitted a numpy 2.0 pull request for a function,
sample, in np.random. It's essentially an implementation of R's sample
function. It draws a sample from a given 1-D array-like, optionally
with non-uniform probabilities and optionally without replacement.
This is very useful for quickly and cleanly creating samples from, for
example, a list of strings or a list of non-contiguous, unevenly
spaced integers. Both occur in data analysis with categorical data.
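
As a rough illustration of the three kinds of sampling described above,
here is a minimal sketch. It does not use the function from the pull
request, whose name and signature may differ. It assumes a NumPy recent
enough to provide np.random.default_rng and Generator.choice, which
offer comparable behavior. The lists colors and codes are made up for
the example.

```python
import numpy as np

# Seeded only so the sketch is repeatable.
rng = np.random.default_rng(0)

colors = ["red", "green", "blue", "yellow"]  # hypothetical categorical labels
codes = [3, 17, 42, 1001]                    # hypothetical non-contiguous integers

# Uniform sampling with replacement from a list of strings.
with_repl = rng.choice(colors, size=10, replace=True)

# Non-uniform sampling: p holds one probability per element and must sum to 1.
weighted = rng.choice(colors, size=10, p=[0.5, 0.3, 0.1, 0.1])

# Sampling without replacement: every drawn element is distinct.
without_repl = rng.choice(codes, size=3, replace=False)

print(with_repl)
print(weighted)
print(without_repl)
```

Note that without replacement the requested size cannot exceed the
number of elements in the population.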

It is, essentially, a convenience function that wraps a number of
existing ways to take a random sample. I think it belongs in
numpy.random rather than scipy.stats because it's just a random
sampler, rather than a probability distribution. It isn't possible to
define a scipy.stats discrete random variable on strings--it would
instead have to be defined on the indices of the list containing the
possible samples. And (as far as I can tell) the scipy.stats
distributions can't be used for sampling without replacement.
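
To make "existing ways to take a random sample" concrete, here is a
sketch of three index-based approaches that a convenience function of
this kind could plausibly wrap. This is an assumption for illustration,
not the code in the pull request. The arrays items and probs are
hypothetical, and the sketch assumes np.random.default_rng is
available.

```python
import numpy as np

# Seeded only so the sketch is repeatable.
rng = np.random.default_rng(0)

items = np.array(["a", "b", "c", "d"])   # hypothetical population
probs = np.array([0.1, 0.2, 0.3, 0.4])   # hypothetical weights, one per item

# Uniform, with replacement: draw random integer indices.
idx_uniform = rng.integers(0, len(items), size=5)
uniform_sample = items[idx_uniform]

# Uniform, without replacement: shuffle the indices and keep a prefix.
idx_perm = rng.permutation(len(items))[:3]
no_repl_sample = items[idx_perm]

# Non-uniform, with replacement: invert the cumulative distribution.
cdf = np.cumsum(probs)
cdf /= cdf[-1]  # guard against rounding so the last entry is exactly 1
idx_weighted = np.searchsorted(cdf, rng.random(5), side="right")
weighted_sample = items[idx_weighted]

print(uniform_sample, no_repl_sample, weighted_sample)
```

Each approach samples indices first and then indexes into the
population, which is why it works equally well for strings and for
arbitrary integers.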

https://github.com/numpy/numpy/pull/151

-Chris JS


