Hypergeometric distribution
Steven D'Aprano
steve at REMOVETHIScyber.com.au
Sun Jan 1 21:31:23 EST 2006
On Sun, 01 Jan 2006 14:24:39 -0800, Raven wrote:
> Thanks Steven for your very interesting post.
>
> This was a critical instance from my problem:
>
> >>> from scipy import comb
> >>> comb(14354,174)
> inf
Curious. It wouldn't surprise me if scipy was using floats, because 'inf'
is usually a floating point value, not an integer.
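A quick way to see why a float-based routine would give up here is to compare the exact value against the largest finite double. This is only a sketch of the overflow itself, not a claim about what scipy does internally, and it assumes a Python new enough to have math.comb (3.8 or later):

```python
import math
import sys

# Exact value as an arbitrary-precision integer (math.comb needs Python 3.8+).
exact = math.comb(14354, 174)

# The largest finite double is roughly 1.8e308.
print(sys.float_info.max)

# Number of decimal digits in the exact result, for comparison.
print(len(str(exact)))

try:
    as_float = float(exact)
except OverflowError:
    # Python refuses to convert an int this large; arithmetic done
    # directly in C doubles would typically overflow to inf instead.
    as_float = math.inf

print(as_float)
```

The exact integer has far more digits than a double can represent, so any calculation that passes through a float at this size has nowhere to go but inf.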
Using my test code from yesterday, I got:
>>> bincoeff(14354,174)
11172777193562324917353367958024437473336018053487854593870
07090637489405604489192488346144684402362344409632515556732
33563523161308145825208276395238764441857829454464446478336
90173777095041891067637551783324071233625370619908633625448
31076677382448616246125346667737896891548166898009878730510
57476139515840542769956414204130692733629723305869285300247
645972456505830620188961902165086857407612722931651840L
Took about three seconds on my system.
> Yes I am calculating hundreds of hypergeometric probabilities so I
> need fast calculations
Another possibility, if you want exact integer maths rather than floating
point with logarithms, is to memoise the binomial coefficients. Something
like this:
# untested
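def bincoeff(n, r, cache={}):
    # The mutable default argument is deliberate: it persists between
    # calls, so it acts as the memo table.
    try:
        return cache[(n, r)]
    except KeyError:
        x = 1
        # Build n!/r! ...
        for i in range(r+1, n+1):
            x *= i
        # ... then divide out (n-r)!. Each step divides exactly, so
        # integer division loses nothing and keeps x an integer.
        for i in range(1, n-r+1):
            x //= i
        cache[(n, r)] = x
        return x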
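
For comparison, the floating point with logarithms route might look roughly like the following. It is a sketch only: the names log_bincoeff and hypergeom_pmf and the parameter names (N for population size, K for successes in the population, n for draws, k for observed successes) are my own choices for illustration, and it relies on math.lgamma from the standard library.

```python
import math

def log_bincoeff(n, r):
    """Natural log of C(n, r), computed via the log-gamma function."""
    return math.lgamma(n + 1) - math.lgamma(r + 1) - math.lgamma(n - r + 1)

def hypergeom_pmf(k, N, K, n):
    """Probability of k successes in n draws without replacement
    from N items, of which K are successes."""
    # Outside the support the probability is zero.
    if k < max(0, n - (N - K)) or k > min(n, K):
        return 0.0
    # Work in log space so the huge coefficients never materialise.
    log_p = (log_bincoeff(K, k)
             + log_bincoeff(N - K, n - k)
             - log_bincoeff(N, n))
    return math.exp(log_p)

print(hypergeom_pmf(1, 5, 2, 2))
```

This trades exactness for speed: the results are ordinary floats with the usual rounding error, but no intermediate value ever overflows, which matters when the coefficients are as large as comb(14354,174).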
--
Steven.