Random Drawing Simulation -- performance issue

David J. Braden dbraden at invalid.add
Wed Sep 13 18:52:06 CEST 2006


David J. Braden wrote:
> Travis E. Oliphant wrote:
>> Brendon Towle wrote:
>>> I need to simulate scenarios like the following: "You have a deck of  
>>> 3 orange cards, 5 yellow cards, and 2 blue cards. You draw a card,  
>>> replace it, and repeat N times."
>>>
>>
>> Thinking about the problem as drawing sample froms a discrete 
>> distribution defined by the population might help.
>>
>> For example, in SciPy you can define your own discrete random variable:
>>
>> var = scipy.stats.rv_discrete(name='sample', 
>> values=([0,1,2],[3/10.,5/10.,2/10.]))
>>
>> Then
>>
>> var.rvs(size=10000)
>>
>> would quickly return an array of "draws"
>>
>> If you didn't want to install SciPy, but were willing to install 
>> NumPy, then the crux of the algorithm is to generate an entire array 
>> of uniform random variables:  numpy.random.rand(count) and then map 
>> them through the inverse cumulative distribution function to generate 
>> your samples.  The inverse cumulative distribution function is just a 
>> series of steps whose width depends on the probablity distribution.
>>
>> Thus, the population defines the draw boundaries.  To make it easy, 
>> just multiply the uniform random number by the total number of cards 
>> and then the boundaries are on the integers of the number of each kind 
>> of card.
>>
>> Here is an implementation.  I find this version to be 2x - 5x faster 
>> depending on how many draws are used.
>>
>>
>> import numpy as N
>>
>> def randomDrawing_N(count, population):
>>     probs = [x[0] for x in population]
>>     res = [[0, item[1]] for item in population]
>>     nums = N.random.rand(count)
>>     cprobs = N.cumsum([0]+probs)
>>     nums *= cprobs[-1]
>>     for k, val in enumerate(probs):
>>         res[k][0] += ((nums > cprobs[k]) & (nums < cprobs[k+1])).sum()
>>     return res
>>
>>
>> -Travis Oliphant
>>
> 
> In response to what Travis and Simon wrote -
> (1) Where the heck can I find a description of scipy's stat functions? 
> Documentation on these seems sparse.
> 
> (2) How does one set up a good timer for Python as implemented for 
> Windows? (i.e., how can I make API calls to Windows from Python?)
> 
> Please bear with me, or not; I am /just/ starting off with Python.
> 
> Thanks,
> Dave Braden

Oops, or rather, duh.
Rereading Robert's post clarified for me where multinomial is, and that 
to get help I need to specify more than simply help(multinomial). 
Sheesh, I'm dim today.

My question re timing code stands.

TIA,
DaveB



More information about the Python-list mailing list