implement random selection in Python
Bruza
benruza at gmail.com
Fri Nov 16 19:50:50 EST 2007
On Nov 16, 4:47 pm, Bruza <benr... at gmail.com> wrote:
> On Nov 16, 6:58 am, duncan smith <buzz... at urubu.freeserve.co.uk>
> wrote:
>
> > Bruza wrote:
> > > I need to implement a "random selection" algorithm which takes a list
> > > of [(obj, prob),...] as input. Each (obj, prob) pair represents how
> > > likely an object, "obj", is to be selected, based on its probability
> > > "prob". To simplify the problem, assume the "prob" values are integers
> > > and that they sum to 100. For example,
>
> > > items = [('Mary',30), ('John', 10), ('Tom', 45), ('Jane', 15)]
>
> > > The algorithm will take a number "N" and a [(obj, prob),...] list as
> > > inputs, and randomly pick "N" objects based on the probabilities of
> > > the objects in the list.
>
> > > For N=1 this is pretty simple; the following code is sufficient to do
> > > the job.
>
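> > > import random
> > >
> > > def foo(items):
> > >     # Draw an integer in 0..99; the "prob" values are assumed to sum to 100.
> > >     index = random.randint(0, 99)
> > >     currentP = 0
> > >     for (obj, p) in items:
> > >         currentP += p  # running cumulative weight
> > >         if currentP > index:
> > >             return obj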
>
> > > But how about the general case, for N > 1 and N < len(items)? Is there
> > > some clever algorithm using Python's standard "random" module to do
> > > the trick?
>
> > I think you need to clarify what you want to do. The "probs" are
> > clearly not probabilities. Are they counts of items? Are you then
> > sampling without replacement? When you say N < len(items) do you mean N
> > <= sum of the "probs"?
>
> > Duncan
>
> I think I need to explain the probability part: the "prob" is a
> relative likelihood that the object will be included in the output
> list. Take my example input of
>
> items = [('Mary',30), ('John', 10), ('Tom', 45), ('Jane', 15)]
>
> For any size of N, 'Tom' (with prob of 45) will be more likely to
> be included in the output list of N distinct members than 'Mary' (prob
> of 30) and much more likely than 'John' (with prob of 10).
>
> I know "prob" is not exactly the "probability" in the context of
> returning a multiple-member list. But what I want is a way to "favor"
> some members in a selection process.
>
> So far, Boris's solution comes closest (but not quite) to what I
> need: it returns a list of N distinct objects from the input
> "items". However, I tried it with an input of
>
> items = [('Mary',1), ('John', 1), ('Tom', 1), ('Jane', 97)]
>
> and have a repeated calling of
>
> Ben
OOPS. I pressed Send too fast.
The problem with Boris's solution is that after repeated calls to
randomPick(3,items), 'Jane' is not the most frequently appearing
member across all the 3-member output lists...
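
For reference, here is a rough sketch of one way to get N distinct picks that favors the higher "prob" values: draw one item at a time with chance proportional to its weight, remove it from the pool, and repeat. This is not Boris's randomPick (that code is not shown in this message); the names weighted_sample and rng are made up for illustration, and it assumes all weights are positive. Note that with this scheme the chance of an object ending up in the output is not exactly proportional to its "prob" once N > 1, it is only favored accordingly.

```python
import random
from collections import Counter

def weighted_sample(items, n, rng=random):
    """Pick n distinct objects from [(obj, weight), ...].

    Hypothetical helper: each draw picks one remaining object with
    chance proportional to its weight, then removes it from the pool.
    """
    if not 0 <= n <= len(items):
        raise ValueError("n must be between 0 and len(items)")
    pool = list(items)          # copy so the caller's list is untouched
    picked = []
    for _ in range(n):
        total = sum(w for _obj, w in pool)
        r = rng.random() * total        # point in [0, total)
        cumulative = 0
        for i, (obj, w) in enumerate(pool):
            cumulative += w
            if r < cumulative:
                break
        # If rounding ever prevents the break, the last item is used.
        picked.append(obj)
        del pool[i]
    return picked

# Rough frequency check with the lopsided input from this thread.
items = [('Mary', 1), ('John', 1), ('Tom', 1), ('Jane', 97)]
counts = Counter()
for _ in range(1000):
    counts.update(weighted_sample(items, 3))
print(counts.most_common())
```

With this input 'Jane' should show up in almost every 3-member list, so counting appearances over many calls is a quick way to check whether a given implementation really favors the heavier members.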