Calculating factorial

Tue Dec 5 04:04:22 EST 2000

"Joshua Muskovitz" <josh at open.com> wrote in message
news:3a2c8c2a_3 at corp.newsfeeds.com...
> What about using xrange() instead of range() to support *really* large
> numbers?

Of marginal usefulness in the context of computing factorials.

The memory space taken by a number X grows, roughly, with the
logarithm of X.  From Stirling's formula, we know that
    log(N!) about= N log(N) - N
for large N; for N large enough, this means that N! takes up
memory space growing as O(N logN).

Since range(N) takes up memory space growing as O(N), for a
large enough N this memory consumption becomes irrelevant
compared to the memory which N! itself takes up.

There _may_ be a value or two of N such that you have just
enough memory to compute N! if you carefully husband every
little bit, which is why the usefulness of xrange is
"marginal" rather than "non-existent".  But I doubt you'd
want to wait for the computation times needed for such
calculations on a typical machine of today (with a few
hundreds megabytes and several hundreds Meaningless
Instructions Per Second), particularly with the multiplication
algorithm that is used by Python's builtin 'long integer'
numbers.

For example, 10000! is a 35,600-digits number (starting
with 284) and (on my old box) is computer in just over
2 seconds with the simple loop using Python longs, half
a second with the loop using gmpy.mpz integers, and just
under 1/4 of a second using gmpy.fac (the gmpy module is
at http://gmpy.sourceforge.net/).

Here, the memory used for range(10000) is a piddling
40k bytes or so, and the memory used for the multiplicand
and result at the final step (about 70,000 digits between
them) is of similar magnitude (perhaps 20+ kbytes).

20000! is a 77,338 digits number (starting with 181) and
its computation takes (again, on my specific box) a bit
more than 11 seconds with ordinary Python long integers,
more than 3 seconds usign gmpy.mpz, and 0.6 seconds using
gmpy.fac.  The memory is still a tiny amount, computation
times are beginning to become noticeable (and growing
far faster than memory consumption, which, all in all, is
only marginally worse than linear).

40000! is a 167,714 digits number (starting with 209) and
its computation takes over 54 seconds with Python long ints,
over 9 with gmpy.mpz's, 1.6 with gmpy.fac.

We can see that a doubling of N means (roughly) a doubling of
the needed memory, with a computation about 5 times slower
for Python longs, 3 to 4 times for gmpy.mpz, 2 to 3 times
for gmpy.fac.  And we're still below 1/4 of a megabyte of
space taken -- with 8 more doublings left to get to a still
reasonably modest 64 MBytes of memory space, we could well
extrapolate that computing the factorial of 10,000,000 would
take over 270 days of time with Python longs, and be a pretty
big problem even for gmpy.fac.  (gmpy.fac actually takes
150 seconds to compute the factorial of 1,000,000, which is
a 5,565,709-digits number).

Alex