Smarter way of doing this?

Paddy McCarthy paddy3118 at netscape.net
Fri Feb 6 20:09:32 CET 2004


"Tim Peters" <tim.one at comcast.net> wrote in message news:<mailman.1247.1076000449.12720.python-list at python.org>...
> [Max M]
> > ...
> > I am interrested in getting a result as a propertion of the
> > probablilty, or more correctly in this case, the frequency. If I want
> > it as probabilities I just have to normalise the sim to 1.0.
> >
> > This has the advantage that the frequencies can be expressed as
> > integers too. This is nice in my Markov chain class that count words
> > in text, etc.
> >
> > In my example below each letter should be occur 50% of the times of
> > the previous letter.
> 
> ...
> 
> > probabilities = [16, 8, 4, 2, 1]
> > elements = ['a', 'b', 'c', 'd', 'e']
> 
> When the weights are integers, a simpler and faster way to proceed is to
> build a list containing each element a number of times equal to its weight:
> 
> alist = ['a'] * 16 + ['b'] * 8 + ['c'] * 4 + ['d'] * 2 + ['e']
> 
> Then just invoke random.choice on the list.  This is a fast, constant-time
> weighted selection technique, but consumes space proportional to the sum of
> the weights.

Indeed, I use something similar in my code.
I allow a dictionaries value to represent the integer weight when
randomly choosing the dictionary keys. I use the following code to
expand the dict to a list:

>>> data=dict(a=3,b=2,c=4)
>>> data
{'a': 3, 'c': 4, 'b': 2}
>>> reduce(lambda x,y: x+y,[[x[0]]*x[1] for x in data.items()] )
['a', 'a', 'a', 'c', 'c', 'c', 'c', 'b', 'b']
>>> 

Cheers, Paddy.

Donald 'Paddy' McCarthy



More information about the Python-list mailing list