Confusing performance results for prime

Fri Oct 17 16:33:24 EDT 2003

In doing some testing of different but simple algorithms for getting a
list of prime numbers, I ended up getting some results that seem a bit
contradictory.  Given the following test program (testPrimes.py) with
two algorithms that both check for primes by testing only odd numbers
using factors up to the square root of the value, where Primes1 is based
on all of the existing primes so far, and Primes2 is based on all odd
numbers, I would expect that, as the number of primes we check gets
larger, that Primes1 should get more & more efficient (since it should
be performing less tests for each number).  However the ratio seems to
be relatively consistent.  Another thing that is interesting is that
once I tested up to n=200000, I got an exception in Primes2.  I added
the long typecast in the 'limit='  statement for both versions and the
resulting improvements were strange: Primes1 runs 10-20% SLOWER, and
Primes2 runs about 50% FASTER!

I am not looking to come up with the most optimized solution to prime
numbers here.  Primes1 is faster than Primes2 as I would expect and it
is clear, relatively elegant, and sufficient for me.  However, I can't
figure out why the ratio doesn't improve with size, and the strange
results of the long typecast, and I would like to be able to understand
what is causing their behavior.  Any ideas?  Thanks,

-- 
Greg

#------------------------------------------------------------
import math, time

def Primes1(y):
    primes = [3]
    for p in range(5, y, 2):
        limit = math.sqrt(p)    # or:   long(math.sqrt(p))
        for factor in primes:
            if factor > limit:      # then number is prime
                primes.append(p)
                break
            elif p % factor == 0:   # then it's *not* prime
                break
    return [2] + primes

def Primes2(y):
    primes = [2,3]
    for p in range(5, y, 2):
        limit = math.sqrt(p) + 1        # or:  long(math.sqrt(p)) + 1
        # check one extra to let us see if there were no factors
        for n in range(3,limit+3,2):
            if  p % n == 0:         # then it's *not* prime
                break
        if n > limit:
            primes.append(p)        # then number is prime
    return primes

if __name__ == "__main__":
    n=100000
    tb = time.time()
    l1 = Primes1(n)
    t1 = time.time() - tb
    print 'Primes1: %s primes up to %s in %5.3f secs' % (len(l1), n, t1)
    tb = time.time()
    l2 = Primes2(n)
    t2 = time.time() - tb
    print 'Primes2: %s primes up to %s in %5.3f secs' % (len(l2), n, t2)
    print 'ratio: %5.2f' % tt

#------------------------------------------------------------

I get the following results (avg for 3 runs each using Python 2.3.2 on
Win XP Pro):

                 n=50000    n=100000      n=200000
w/out long:
-----------
Primes1:        0.41        0.95        2.24
Primes2:        1.90        4.21        9.28
Ratio:            4.63        4.42        4.14

w/ long:
-----------
Primes1:        0.48        1.12        2.60
Primes2:        0.84        2.04        4.75
Ratio:            1.75        1.82        1.82