generate a random sample based on independent priors in `random.choice` #22082

Proposed new feature or change: Objective: Sample elements from a given iterable (a list, a NumPy array, etc.) according to predefined probabilities associated with each element, which need not sum to 1.
Overview
- When a list of probabilities `p` is passed explicitly to `numpy.random.choice`, the function requires the whole list to sum to 1. This makes sense when all the elements are possible observations of a single random variable whose pmf (probability mass function) is the `p` list.
- But when every element in that list can be regarded as an observation of a separate, independent Bernoulli r.v. (random variable), the problem falls into the multi-label category. Intuitively, for each element in the given list or array we toss a coin (biased or unbiased) ~ B(p), i.e., a Bernoulli trial with p as the probability of heads (success).
- The output array or list would usually be a proper subset of the input, but it can also be the full list or an empty one.
- The `size` argument should be disabled here, since we only want to know which elements were selected into the output list after n coin tosses (where len(list) = n). It may also happen that the elements are independent as argued above and yet sum(p) = 1. To cover that case, we could add an extra argument, `independence = True` or `independence = False`.
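The constraint described in the overview can be checked directly. The following is a minimal sketch, assuming the `Generator` API from `np.random.default_rng`; the item labels, the probability values, and the helper name `choice_rejects` are illustrative and do not come from the proposal.

```python
import numpy as np

rng = np.random.default_rng(0)
items = np.array(["a", "b", "c"])

# Illustrative per-element probabilities that are valid individually
# but sum to 1.6, so they do not form a single pmf.
p_independent = np.array([0.9, 0.5, 0.2])


def choice_rejects(items, p):
    """Hypothetical helper: report whether `choice` refuses the given p."""
    try:
        rng.choice(items, p=p)
    except ValueError:
        return True
    return False


# `choice` treats p as the pmf of one random variable, so it rejects
# probabilities that do not sum to 1 ...
rejected = choice_rejects(items, p_independent)

# ... and accepts them once normalized, which changes their meaning:
# exactly one element is then drawn per call.
accepted = not choice_rejects(items, p_independent / p_independent.sum())
```

Normalizing makes the call succeed, but it answers a different question from the one in the proposal, where each element is kept or dropped independently.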

As discussed on the corresponding GitHub issue [1], this is not functionality that we will add to `choice()`, nor is it likely something that we will add as a separate `Generator` method. The desired computation is straightforward to do by composing existing functionality.
[1] https://github.com/numpy/numpy/issues/22082
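One possible composition, shown only as a sketch and not necessarily the one discussed on the issue: draw one uniform number per element with `Generator.random` and keep the elements whose draw falls below their probability. The helper name `bernoulli_subset` and the sample values are hypothetical; no such function exists in NumPy.

```python
import numpy as np


def bernoulli_subset(items, p, rng=None):
    """Hypothetical helper: keep items[i] independently with probability p[i]."""
    rng = np.random.default_rng() if rng is None else rng
    items = np.asarray(items)
    p = np.asarray(p, dtype=float)
    if items.shape != p.shape:
        raise ValueError("items and p must have the same shape")
    if np.any((p < 0) | (p > 1)):
        raise ValueError("each probability must lie in [0, 1]")
    # One uniform draw in [0, 1) per element; draw < p[i] happens with
    # probability p[i], which is a Bernoulli(p[i]) trial.
    mask = rng.random(p.shape) < p
    return items[mask]


rng = np.random.default_rng(12345)
# p = 1.0 always keeps "a", p = 0.0 never keeps "b"; "c" and "d" are coin flips.
subset = bernoulli_subset(["a", "b", "c", "d"], [1.0, 0.0, 0.5, 0.5], rng)
```

The result can be empty, a proper subset, or the full input, and no `size` argument is needed because the output length is itself random.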
On Thu, Aug 4, 2022 at 2:07 PM Rodo-Singh adityasinghdrdo@gmail.com wrote:
Proposed new feature or change: Objective: Sample elements from a given iterable (a list, a NumPy array, etc.) according to predefined probabilities associated with each element, which need not sum to 1.
Overview
- When a list of probabilities `p` is passed explicitly to `numpy.random.choice`, the function requires the whole list to sum to 1. This makes sense when all the elements are possible observations of a single random variable whose pmf (probability mass function) is the `p` list.
- But when every element in that list can be regarded as an observation of a separate, independent Bernoulli r.v. (random variable), the problem falls into the multi-label category. Intuitively, for each element in the given list or array we toss a coin (biased or unbiased) ~ B(p), i.e., a Bernoulli trial with p as the probability of heads (success).
- The output array or list would usually be a proper subset of the input, but it can also be the full list or an empty one.
- The `size` argument should be disabled here, since we only want to know which elements were selected into the output list after n coin tosses (where len(list) = n). It may also happen that the elements are independent as argued above and yet sum(p) = 1. To cover that case, we could add an extra argument, `independence = True` or `independence = False`.
_______________________________________________
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-leave@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
Member address: robert.kern@gmail.com

Every new feature, method, function, or keyword argument adds cognitive load and maintainer burden, makes the package larger, and can confuse users, so we reject much more than we accept.
The first step would be to present a convincing case for _why_ Poisson sampling should be added to NumPy, before discussing the _how_.
Matti
On 4/8/22 21:11, Robert Kern wrote:
As discussed on the corresponding GitHub issue [1], this is not functionality that we will add to `choice()`, nor is it likely something that we will add as a separate `Generator` method. The desired computation is straightforward to do by composing existing functionality.
[1] https://github.com/numpy/numpy/issues/22082
-- Robert Kern
participants (3)
- Matti Picus
- Robert Kern
- Rodo-Singh