Re: [Numpy-discussion] categorical distributions

22 Nov 2010


      On Mon, Nov 22, 2010 at 6:05 AM, Hagen Fürstenau  wrote:
...
...
ISTM that this elementary functionality deserves an implementation
that's as fast as it can be.
To substantiate this, I just wrote a simple implementation of
"categorical" in "numpy/random/mtrand.pyx" and it's more than 8x faster
than your version for multiple samples of the same distribution and more
than 3x faster than using "multinomial(1, ...)" for multiple samples of
different distributions (each time tested with 1000 samples drawn from
distributions over 1000 categories).
I can provide it as a patch if there's any interest.
Can you compare the speed of your cython solution with the version of Chuck

--
For instance, weight 0..3 by 1..4, then

In [14]: w = arange(1,5)

In [15]: p = cumsum(w)/float(w.sum())

In [16]: bincount(p.searchsorted(random(1000000)))/1e6
Out[16]: array([ 0.100336,  0.200382,  0.299132,  0.40015 ])

-------------
from numpy mailing list thread "Weighted random integers", sep. 10
Using searchsorted hat roughly a 10 times speedup compared to my
multinomial version

Josef
...
- Hagen
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] categorical distributions

josef.pktd＠gmail.com