[Tutor] (No Subject)

kevin parks kp87@lycos.com
Fri, 22 Feb 2002 01:59:32 +0900

I am trying to add  weights or probabilities to a list of items to be chosen
randomly from a list.

A straight-ahead random picker would be something like:
>>> import random
>>> x=['one', 'two', 'three']
>>> item = random.choose(x)

here all items have an equal (33.3333% chance) of being picked.

Now i want to say that 'one' has a 10% chance of being picked, 
'two' a 30% chance, and 'three' 60% or whatever...

The poor person's way of doing this is to stack the deck:
x=['one', 'two', 'two', 'two', 'three', 'three', 'three', 'three', 'three', 'three']

But i would rather specify with more flexibility like so:

list1=[('one', 0.10), ('two', 0.30), ('three', 0.60)]
list1=[[one', 0.10], ['two', 0.30], ['three', 0.60]]

and fancier things..

so i am starting with something like this (I am at a PC room with no
interpreter so if i make a type forgive me)... my version...

import random

def wc(lst):
    n = random.uniform(0,1)
    for item, weight in lst:
        if n < weight:
        n = n - weight
    return item


was i headed down the right path here?

i searched c.l.p and found a thread that had the following:

someone said that the subtraction is not 
needed and the comparison with 1.0 can then be eliminated.  By putting largest 
weights first, the remaining number of comparisons is minimized:

list1 = [('three', 0.5), ('two', 0.75)] # else 'one'
def wc(list):
   n = random.uniform(0, 1)
   for item, weight in list:
     if n < weight:
       return item
   return 'one'

I don't understand this since then our last choice has to be hardwired in that
second return statement. I am guessing that the 0.75 is a type and was supposed
to be a 0.25

someone else suggested this with the disclaimer that it 
was not very efficient method and was untested:

def weighted_choice(choices):
    tot = 0
    for w,v in choices:
        tot = tot + w
    d = random.random()*tot
    tot = 0
    for w,v in choices:
        tot = tot + w
        if tot > d:
            return v

I am not yet able to get my head around this one but i put it through danny's histogram()
and it seemed to work.  Which approach is more better (more robust, faster, all-purpose,
efficient, etc.)

Check out Cupid School where you will learn from Matchmaker's
best and brightest. Good Luck!