implement random selection in Python

Bruza benruza at
Sat Nov 17 01:47:16 CET 2007

On Nov 16, 6:58 am, duncan smith <buzz... at>
> Bruza wrote:
> > I need to implement a "random selection" algorithm which takes a list
> > of [(obj, prob),...] as input. Each (obj, prob) pair represents how
> > likely an object, "obj", is to be selected, based on its probability,
> > "prob". To simplify the problem, assume the "prob" values are integers
> > and that they sum to 100. For example,
> >    items = [('Mary',30), ('John', 10), ('Tom', 45), ('Jane', 15)]
> > The algorithm will take a number "N" and a [(obj, prob),...] list as
> > inputs, and randomly pick "N" objects based on the probabilities of
> > the objects in the list.
> > For N=1 this is pretty simple; the following code is sufficient to do
> > the job.
> > import random
> >
> > def foo(items):
> >     index = random.randint(0, 99)  # uniform integer in 0..99
> >     currentP = 0
> >     for (obj, p) in items:
> >         currentP += p  # running total of the "prob" values
> >         if currentP > index:
> >             return obj
> > But how about the general case, for N > 1 and N < len(items)? Is there
> > some clever algorithm using Python standard "random" package to do
> > the trick?
> I think you need to clarify what you want to do.  The "probs" are
> clearly not probabilities.  Are they counts of items?  Are you then
> sampling without replacement?  When you say N < len(items) do you mean N
> <= sum of the "probs"?
> Duncan

I think I need to explain the probability part: the "prob" is a
relative likelihood that the object will be included in the output
list. So, with my example input of

  items = [('Mary',30), ('John', 10), ('Tom', 45), ('Jane', 15)]

for any size of N, 'Tom' (with prob of 45) will be more likely to
be included in the output list of N distinct members than 'Mary' (prob
of 30), and much more likely than 'John' (with prob of 10).

I know "prob" is not exactly a "probability" in the context of
returning a list of multiple members. But what I want is a way to
"favor" some members in a selection process.
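
To make the intended behaviour concrete, here is a minimal sketch of one
possible reading of it: draw N times without replacement, each time
picking among the remaining objects in proportion to their "prob"
weights. The names weighted_sample and rng are hypothetical, and this is
not Boris's solution; it also assumes all weights are positive.

```python
import random

def weighted_sample(items, n, rng=random):
    """Return n distinct objects from a list of (obj, weight) pairs.

    Assumption: each draw picks among the objects still in the pool
    with chance proportional to its weight, then removes the winner.
    """
    if not 0 <= n <= len(items):
        raise ValueError("n must be between 0 and len(items)")
    pool = list(items)  # work on a copy so the caller's list is untouched
    chosen = []
    for _ in range(n):
        total = sum(w for _, w in pool)
        r = rng.random() * total  # uniform point in [0, total)
        running = 0
        for i, (obj, w) in enumerate(pool):
            running += w
            if r < running:
                break
        # If rounding ever prevents the break, the last item is taken.
        chosen.append(obj)
        del pool[i]  # remove the winner so it cannot be picked again
    return chosen

items = [('Mary', 30), ('John', 10), ('Tom', 45), ('Jane', 15)]
print(weighted_sample(items, 2))
```

With this scheme the weights are only relative likelihoods per draw, so
the chance that an object ends up in the output is not simply
proportional to its "prob" once N > 1.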

So far, Boris's solution comes closest (though not quite) to what I
need: it returns a list of N distinct objects from the input
"items". However, I tried it with an input of

   items = [('Mary',1), ('John', 1), ('Tom', 1), ('Jane', 97)]

and have a repeated calling of

