Probability selection algorithm

Mel Wilson mwilson at
Mon Feb 3 22:12:10 CET 2003

In article <7xhebnscrj.fsf at>,
Paul Rubin <phr-n2003b at> wrote:
>Paul Rubin <phr-n2003b at> writes:
>> [ ... ]
>> where urand returns a random number between 0.0 and 1.0 and
>> problist is your list of probabilities.  "selected" is set to
>> the appropriate randomly chosen index.
>> Do I have that right?
>Whoops, no I didn't have it right.  Try this instead:
>    def sprob(problist):
>        remaining = 1.0
>        for i in range(len(problist)):
>            p = problist[i]
>            if urand()*remaining < p:
>                return i
>            remaining -= p
>        raise "shouldn't get here"
>Again, this depends on your probability list adding up to 1.
>Otherwise, add up the probabilities and scale everything appropriately.

   There's a variant that works nicely with unnormalized

    selected = None # in case the set of pairs is empty
    cumulative = 0
    for weighting, payload in set_of_weighting_payload_pairs:
        cumulative += weighting
        if urand() * cumulative < weighting:
            selected = payload

`weighting` is the histogram value correlating with the
probablilty of selecting its corresponding payload.
Weightings don't have to sum to 1.0 .

   Good for random Markov text generators etc.  It's a
generalization of the algorighthm that chooses a fortune
cookie from an n-cookie file giving each a probability of
1/n for being chosen.

        Regards.    Mel.

More information about the Python-list mailing list