implement random selection in Python

Bruza benruza at gmail.com
Fri Nov 16 19:50:50 EST 2007


On Nov 16, 4:47 pm, Bruza <benr... at gmail.com> wrote:
> On Nov 16, 6:58 am, duncan smith <buzz... at urubu.freeserve.co.uk>
> wrote:
>
>
>
> > Bruza wrote:
> > > I need to implement a "random selection" algorithm which takes a list
> > > of [(obj, prob),...] as input. Each (obj, prob) pair represents how
> > > likely an object, "obj", is to be selected, based on its probability
> > > "prob". To simplify the problem, assume the "prob" values are integers
> > > and that they sum to 100. For example,
>
> > >    items = [('Mary',30), ('John', 10), ('Tom', 45), ('Jane', 15)]
>
> > > The algorithm will take a number "N", and a [(obj, prob),...] list as
> > > inputs, and randomly pick "N" objects based on the probabilities of
> > > the objects in the list.
>
> > > For N=1 this is pretty simple; the following code is sufficient to do
> > > the job.
>
> > > import random
> > >
> > > def foo(items):
> > >     index = random.randint(0, 99)  # uniform integer in 0..99
> > >     currentP = 0
> > >     for (obj, p) in items:
> > >         currentP += p  # running total of the weights
> > >         if currentP > index:
> > >             return obj
>
> > > But how about the general case, for N > 1 and N < len(items)? Is there
> > > some clever algorithm using Python's standard "random" module to do
> > > the trick?
>
> > I think you need to clarify what you want to do.  The "probs" are
> > clearly not probabilities.  Are they counts of items?  Are you then
> > sampling without replacement?  When you say N < len(items) do you mean N
> > <= sum of the "probs"?
>
> > Duncan
>
> I think I need to explain the probability part: the "prob" is a
> relative likelihood that the object will be included in the output
> list. In my example input of
>
>   items = [('Mary',30), ('John', 10), ('Tom', 45), ('Jane', 15)]
>
> for any size of N, 'Tom' (with prob of 45) will be more likely to
> be included in the output list of N distinct members than 'Mary' (prob
> of 30) and much more likely than 'John' (with prob of 10).
>
> I know "prob" is not exactly the "probability" in the context of
> returning a multiple-member list. But what I want is a way to "favor"
> some members in a selection process.
>
> So far, Boris's solution comes closest (but not quite) to what I
> need: it returns a list of N distinct objects from the input
> "items". However, I tried it with input of
>
>    items = [('Mary',1), ('John', 1), ('Tom', 1), ('Jane', 97)]
>
> and have a repeated calling of
>
> Ben

OOPS. I pressed Send too fast.

The problem with Boris's solution is that after calling
randomPick(3,items) repeatedly, 'Jane' is not the most frequently
appearing member across all the 3-member output lists...
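
One possible reading of the requirement is sequential weighted sampling
without replacement: draw one object in proportion to the remaining
weights, remove it from the pool, and repeat N times. The sketch below
follows that reading. It is not Boris's randomPick (which is not shown
in this message), and the names weighted_sample, n and rng are
hypothetical, chosen only for illustration.

```python
import random
from collections import Counter


def weighted_sample(items, n, rng=random):
    """Pick n distinct objects from [(obj, weight), ...].

    Each draw is proportional to the weights still in the pool.
    `rng` is an assumed hook: the random module or a random.Random instance.
    """
    if not 0 <= n <= len(items):
        raise ValueError("n must be between 0 and len(items)")
    pool = list(items)  # copy so the caller's list is left untouched
    picked = []
    for _ in range(n):
        total = sum(w for _, w in pool)
        r = rng.random() * total  # point in [0, total)
        running = 0
        for i, (obj, w) in enumerate(pool):
            running += w
            if r < running:
                break
        # If rounding prevents a break, the last entry is used as a fallback.
        picked.append(obj)
        del pool[i]  # remove the winner so it cannot be drawn again
    return picked


if __name__ == "__main__":
    items = [('Mary', 1), ('John', 1), ('Tom', 1), ('Jane', 97)]
    counts = Counter()
    for _ in range(10000):
        counts.update(weighted_sample(items, 3))
    # How often each name appeared across the 3-member lists.
    print(counts.most_common())
```

Under this scheme, with the weights above, 'Jane' is missing from a
3-member list only when all three draws land on the weight-1 entries,
which has probability (3/100) * (2/99) * (1/98), so she should appear
in nearly every list, while each of the other three appears in roughly
two thirds of them.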



