Probability selection algorithm

Mel Wilson mwilson at the-wire.com
Mon Feb 3 22:12:10 CET 2003


In article <7xhebnscrj.fsf at ruckus.brouhaha.com>,
Paul Rubin <phr-n2003b at NOSPAMnightsong.com> wrote:
>Paul Rubin <phr-n2003b at NOSPAMnightsong.com> writes:
>> [ ... ]
>> where urand returns a random number between 0.0 and 1.0 and
>> problist is your list of probabilities.  "selected" is set to
>> the appropriate randomly chosen index.
>>
>> Do I have that right?
>
>Whoops, no I didn't have it right.  Try this instead:
>
>    def sprob(problist):
>        remaining = 1.0
>        for i in range(len(problist)):
>            p = problist[i]
>            if urand()*remaining < p:
>                return i
>            remaining -= p
>        raise "shouldn't get here"
>
>Again, this depends on your probability list adding up to 1.
>Otherwise, add up the probabilities and scale everything appropriately.

   There's a variant that works nicely with unnormalized
weightings:

    selected = None # in case the set of pairs is empty
    cumulative = 0
    for weighting, payload in set_of_weighting_payload_pairs:
        cumulative += weighting
        if urand() * cumulative < weighting:
            selected = payload

`weighting` is the histogram value correlating with the
probablilty of selecting its corresponding payload.
Weightings don't have to sum to 1.0 .

   Good for random Markov text generators etc.  It's a
generalization of the algorighthm that chooses a fortune
cookie from an n-cookie file giving each a probability of
1/n for being chosen.

        Regards.    Mel.




More information about the Python-list mailing list