Hypergeometric distribution

Bengt Richter bokr at oz.net
Wed Jan 4 21:00:43 EST 2006

On 4 Jan 2006 12:46:47 -0800, "Raven" <balckraven at gmail.com> wrote:

>Cameron Laird wrote:
>> This thread confuses me.
>> I've lost track of the real goal.  If it's an exact calculation of
>> binomial coefficients--or even one of several other potential
>> targets mentioned--I echo Steven D'Aprano, and ask, are you *sure*
>> the suggestions already offered aren't adequate?
>Hi Cameron, my real goal was to calculate the hypergeometric
>distribution. The problem was that the function for hypergeometric
ISTM that can't have been your "real goal" -- unless you are e.g. preparing numeric
tables for publication. IOW, IWT you probably intend to USE the hypergeometric
distribution values in some useful way to go towards your "real goal." ;-)

The requirements of this USE are still not apparent to me in your posts, though
that may be because I've missed something.

>calculation from scipy uses the scipy.comb function which by default
>uses floats so for large numbers comb(n,r) returns inf. and hence the
>hypergeometric returns nan.
>The first suggestion, the one by Robert Kern, resolved my problem:
>Raven wrote:
>>Thanks to all of you guys, I could resolve my problem using the
>>logarithms as proposed by Robert.
>Then the other guys gave alternative solutions so I tried them out. So
>for me the suggestions offered are more than adequate :-)
>Cameron Laird wrote:
>>Also, I think you
>> might not realize how accurate Stirling's approximation (perhaps to
>> second order) is in the range of interest.
>The problem with Stirling's approximation is that I need to calculate
>the hypergeometric hence the factorial for numbers within a large range
>e.g. choose(14000,170) or choose(5,2)
It seems you are hinting at some accuracy requirements that you haven't
yet explained. I'm curious how you use the values, and how that affects your
judgement of Stirling's approximation. In fact, perhaps the semantics of your
value usage could even suggest an alternate algorithmic approach to your actual end result.
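
To make the accuracy question concrete, here is a rough sketch (not the scipy
implementation, and not necessarily what Robert proposed in detail) of the
logarithm approach next to a Stirling series with the 1/(12n) correction term.
All function and parameter names below are made up for illustration; only
math.lgamma, math.log and math.exp come from the standard library.

```python
import math


def log_choose(n, k):
    """Natural log of the binomial coefficient C(n, k), via log-gamma.

    Working in log space avoids the float overflow that turns a large
    C(n, k) into inf.
    """
    if k < 0 or k > n:
        return float("-inf")  # C(n, k) is 0 outside 0 <= k <= n
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)


def hypergeom_pmf(k, population, successes, draws):
    """P(exactly k successes) when drawing `draws` items without replacement
    from `population` items, of which `successes` are successes.

    Parameter names are illustrative assumptions, not scipy's signature.
    """
    log_p = (log_choose(successes, k)
             + log_choose(population - successes, draws - k)
             - log_choose(population, draws))
    return math.exp(log_p)


def stirling_log_factorial(n):
    """ln(n!) from Stirling's series, truncated after the 1/(12n) term."""
    if n < 2:
        return 0.0  # 0! = 1! = 1; the series is not usable at n = 0
    return (n * math.log(n) - n
            + 0.5 * math.log(2.0 * math.pi * n)
            + 1.0 / (12.0 * n))


def stirling_log_choose(n, k):
    """ln C(n, k) assembled from the truncated Stirling series."""
    return (stirling_log_factorial(n)
            - stirling_log_factorial(k)
            - stirling_log_factorial(n - k))


if __name__ == "__main__":
    # The two cases mentioned in the thread: one large, one small.
    for n, k in [(14000, 170), (5, 2)]:
        exact = log_choose(n, k)
        approx = stirling_log_choose(n, k)
        print(n, k, exact, approx, abs(exact - approx))
```

The truncated series is at its worst when the arguments are tiny, as in
choose(5,2), and improves quickly as they grow, so whether it is "accurate
enough" still depends on what the values are used for.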

Bengt Richter
