[Edu-sig] Speaking of factorials...

Mon, 28 May 2001 19:01:35 -0400

[Kirby Urner]
> While we're on the subject, I just wanted to go over
> something about the combination function, defined as
>
>                       n!
>    combo(n,k) =    --------
>                    k!(n-k)!
> ...
> But it's wasteful to implement this expression literally,
> ...
> Using Guido's little timer guy from his excellent essay at
> http://www.python.org/doc/essays/list2str.html (scroll down),
> we can test the relative efficacy if canceling (goodcombo),
> vis ignoring the opportunity (badcombo):
>
>  >>> def goodcombo(n,k):
>         return reduce(mul,range(n,n-k,-1))/reduce(mul,range(1,k+1))
>
>  >>> def badcombo(n,k):
> 	 return (reduce(mul,range(1,n+1))/
>      	       (reduce(mul,range(1,k+1))*reduce(mul,range(1,n-k+1))))

Oh, Kirby, you have got to get over your unhealthy fascination with reduce
<wink>.  Seriously, this code is unreadable.  Python is supposed to be
pretty!  Here's a tease:

goodcombo 1.506
badcombo 1.659
shortcomb 0.615

"shortcomb" is shown later.  No reduce.

> I rest my case.

Ditto <wink>.

>   >>> timing(goodcombo,500,10,6)
>   goodcomb 0.275
>   >>> timing(badcombo,500,11,6)
>   badcomb 0.453

Given the way goodcombo and badcombo work, passing 10,6 to goodcombo but
11,6 to badcombo gave goodcombo an unfair advantage.  Here's the reduce-free
shortcomb:

def shortcomb(m, n):
    """m, n -> number of combinations of m items, n at a time.

    m >= n >= 0 required.  Doesn't check its arguments.  Assumes nothing
    will overflow -- tough luck if it does.
    """

    if n > (m >> 1):
        n = m-n
    if n == 0:
        return 1
    result = m
    i = 2
    m, n = m-1, n-1
    while n:
        # assert (result * m) % i == 0
        result = result * m / i
        i += 1
        n -= 1
        m -= 1
    return result

Timing was done via

timing(goodcombo, 5000, 11, 6)
timing(badcombo, 5000, 11, 6)
timing(shortcomb, 5000, 11, 6)

Now in practice I don't use shortcomb either, for reasons you might be able
to guess from the snotty comments I put in its docstring <wink>.  Here's the
version I actually use, and heartily recommend it.  It's slower than
badcombo!  But it complains if you feed it bad arguments, and uses bigint
arithmetic so that overflow isn't a concern (which, btw, is the real reason
it's slower):

def _chop(n):
    """n -> int if it fits, else long."""

    try:
        return int(n)
    except OverflowError:
        return n

def comb(m, n):
    """m, n -> number of combinations of m items, n at a time.

    m >= n >= 0 required.
    """

    if not m >= n >= 0:
        raise ValueError("m >= n >= 0 required: " + `m, n`)
    if n > (m >> 1):
        n = m-n
    if n == 0:
        return 1
    result = long(m)
    i = 2
    m, n = m-1, n-1
    while n:
        # assert (result * m) % i == 0
        result = result * m / i
        i += 1
        n -= 1
        m -= 1
    return _chop(result)

Even shortcomb is less prone to spurious overflows than goodcombo or
badcombo, though, because it divides out each part of the denominator just
as soon as it's certain the numerator is divisible by it:

>>> badcombo(52, 7)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "timecomb.py", line 60, in badcombo
    return (reduce(mul,range(1,n+1))/
OverflowError: integer multiplication
>>> goodcombo(52, 7)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "timecomb.py", line 57, in goodcombo
    return reduce(mul,range(n,n-k,-1))/reduce(mul,range(1,k+1))
OverflowError: integer multiplication
>>> shortcomb(52, 7)
133784560
>>>