# Weighted "random" selection from list of lists

Peter Otten __peter__ at web.de
Sat Oct 8 21:04:32 CEST 2005

```
Jesse Noller wrote:

> I'm probably missing something here, but I have a problem where I am
> populating a list of lists like this:
>
> list1 = [ 'a', 'b', 'c' ]
> list2 = [ 'dog', 'cat', 'panda' ]
> list3 = [ 'blue', 'red', 'green' ]
>
> main_list = [ list1, list2, list3 ]
>
> Once main_list is populated, I want to build a sequence from items
> within the lists, "randomly" with a defined percentage of the sequence
> coming from the various lists. For example, if I want a 6 item
> sequence, I might want:
>
> 60% from list 1 (main_list[0])
> 30% from list 2 (main_list[1])
> 10% from list 3 (main_list[2])
>
> I know how to pull a random sequence (using random()) from the lists,
> but I'm not sure how to pick it with the desired percentages.

If the percentages can be normalized to small integers, just make a
pool where each list is repeated according to its weight, e.g.
list1 occurs 6 times, list2 3 times, and list3 once:

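import random

pools = [list1, list2, list3]
weights = [6, 3, 1]
sample_size = 10

# Repeat each pool according to its weight.
weighted_pools = []
for p, w in zip(pools, weights):
    weighted_pools.extend([p] * w)

# Pick a pool (weighted), then an item from that pool (uniform).
sample = [random.choice(random.choice(weighted_pools))
          for _ in range(sample_size)]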

Another option is to use bisect() to choose a pool:

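import bisect
import random

pools = [list1, list2, list3]
sample_size = 10

def isum(items, sigma=0.0):
    # Yield the running total of items, starting from sigma.
    for item in items:
        sigma += item
        yield sigma

cumulated_weights = list(isum([60, 30, 10], 0))  # [60, 90, 100]
sigma = cumulated_weights[-1]

sample = []
for _ in range(sample_size):
    # Map a random point in [0, sigma) to the pool whose interval contains it.
    pool = pools[bisect.bisect(cumulated_weights, random.random()*sigma)]
    sample.append(random.choice(pool))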

(all code untested)

Peter

```
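
On Python 3.6 or later, the standard library's random.choices() accepts a weights argument, so the weighted pool pick described in the reply can probably be written more directly. The following is only a minimal sketch under that assumption, reusing the lists and the 60/30/10 weights from the question; the function name weighted_sample and its rng parameter are hypothetical and do not appear in the original post.

```python
import random

def weighted_sample(pools, weights, sample_size, rng=random):
    """Return sample_size items; each draw first picks a pool by weight.

    weighted_sample and rng are illustrative names, not from the post.
    """
    # random.choices draws with replacement, honoring the relative weights.
    chosen_pools = rng.choices(pools, weights=weights, k=sample_size)
    # Then pick one item uniformly from each chosen pool.
    return [rng.choice(pool) for pool in chosen_pools]

list1 = ['a', 'b', 'c']
list2 = ['dog', 'cat', 'panda']
list3 = ['blue', 'red', 'green']

sample = weighted_sample([list1, list2, list3], [60, 30, 10], 10)
print(sample)
```

As with both approaches in the reply, the weights here are probabilities per draw, so a 10-item sample will not always contain exactly 6 items from list1.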
