urn scheme
Duncan Smith
buzzard at urubu.freeserve.co.uk
Mon Jul 28 09:38:21 EDT 2003
Hello,
I'm currently doing some simulations where I need to sample without
replacement from a list of counts. (Actually it's a Numeric array, but I'm
currently converting the array to a list, using the following function and
converting back to an array.) I *desperately *need to speed this up
(simulations currently take over 24 hours). This sampling is a real
bottleneck, and accounts for about 90% of the time cost. If I can get a
10-fold speed up I might be in business, and a 100-fold speed up might even
allow me to meet a deadline. Using psyco appears to give me no speed up
whatsoever (that can't be right, can it?). I am currently working on
reducing the number of function calls to urn(), but the biggest speed up
will still come from improving the performance of this function. Any ideas
for a significant speed up? TIA.
Duncan
-----------------------------------------------------------
import random
def urn(bins, draws, drawn=0, others=0):
"""Bins is a sequence containing the numbers
of balls in the urns. 'drawn' and 'others'
define how the contents of the bins should
be updated after each draw in draws. The
default arguments correspond to sampling
with replacement. e.g. drawn = -1 corresponds
to sampling without replacement"""
len_bins = len(bins)
res = [0] * len_bins
cum_bins = [bins[0]] + [0] * (len_bins-1)
for i in range(1, len_bins):
cum_bins[i] = cum_bins[i-1] + bins[i]
for i in range(draws):
balls = random.random() * cum_bins[-1]
adj = 0
for j in range(len_bins):
if adj == 0 and cum_bins[j] > balls:
res[j] += 1
cum_bins[j] += drawn
adj = 1
else:
cum_bins[j] += (j - adj) * others + adj * drawn
return res
----------------------------------------------------------------
e.g.
>>> urn((0,2,5,3,4), 10, -1)
[0, 1, 4, 1, 4]
>>>
More information about the Python-list
mailing list