Pick items from list with probability based upon property of list member ?

Steven D'Aprano steve at REMOVE-THIS-cybersource.com.au
Sun Jun 20 10:24:22 EDT 2010


On Sun, 20 Jun 2010 11:27:30 +0000, Mel wrote:

> southof40 wrote:
> 
>> I have list of of N Vehicle objects - the only possible vehicles are
>> cars, bikes, trucks.
>>
>> I want to select an object from the list with a probability of : cars
>> 0.7, bikes 0.3, trucks 0.1.
>>
>> I've currently implemented this by creating another list in which each
>> car object from the original list appears 7 times, each bike 3 times
>> and each truck once. I then pick at random from that list.
> 
> This seems about right.  It's like a lottery where the more likely
> winners have more tickets, but all tickets are the same.  Pick one to
> pick the winner.

It could be expensive to calculate though. Suppose the probabilities were:

0.608729
0.235012
0.156259


> There's a very expensive, but flexible technique that effectively gives
> some tickets a better chance than others.  You have to examine each
> ticket individually, so this algorithm burns random numbers like
> kindling:
> 
> 
> 
> import random
> 
> #===============================
> def weighted_choice (candidates, weight):
>     chosen = None
>     total = 0
>     for c in candidates:
>         w = weight (c)
>         total += w
>         if random.randint (0, total-1) < w:
>             chosen = c
>     return chosen
> #===============================

Nice! But instead of randint(0, total-1), you can use randrange(0, total).

This only requires N random numbers (for N candidates), which isn't 
really that excessive, but we can do better, and support non-integer 
weights as well.

def weighted_choice(candidates, weights):
    """Choose between a list of candidates with a list of weights."""
    cumulative_weights = []
    running_total = 0
    for w in weights:
        running_total += w
        cumulative_weights.append(running_total)
    x = random.uniform(0, running_total)
    for item, cw in zip(candidates, cumulative_weights):
        if x <= cw:
            return item

This is O(N) on the number of candidates, and requires one call to random 
no matter what N is.



-- 
Steven



More information about the Python-list mailing list