How can I speed up a script that iterates over a large range (600 billion)?

Ian Kelly ian.g.kelly at gmail.com
Wed Jun 22 03:16:36 EDT 2011


On Tue, Jun 21, 2011 at 11:58 PM, Chris Torek <nospam at torek.net> wrote:
> I was curious about implementing prime factorization as a generator,
> using a prime-number generator to come up with the factors, and
> doing memoization of the generated primes to produce a program that
> does what "factor" does, e.g.:

This is a generator-based sieve I wrote a while back to solve the
PRIME1 problem at SPOJ.  The problem is to generate all the prime
numbers within specified ranges, where the numbers are great enough
that a full sieve would run out of memory, and the ranges are wide
enough that a O(sqrt(n)) test on each number would simply take too
long:

https://www.spoj.pl/problems/PRIME1/

The script is not terribly impressive from a technical standpoint, but
what tickles me about it is its bootstrappiness; the set that the
"primes" generator checks to determine whether each number is prime is
actually built from the output of the generator, which itself contains
no actual primality-testing logic.  Hope you like it:

8<--------------------------------------------------------------------
import math

def primes(m, n):
    # Yield all the primes in the range [m, n), using the nonprimes set
    # as a reference.  Except for 2, only odd integers are considered.
    if m <= 2:
        yield 2
        m = 3
    elif m % 2 == 0:
        m += 1  # Force m to be odd.
    for p in xrange(m, n, 2):
        if p not in nonprimes:
            yield p

# Read all the bounds to figure out what we need to store.
bounds = [map(int, raw_input().split(' ')) for t in xrange(input())]
limit = max(n for (m, n) in bounds)
sqrt_limit = int(math.sqrt(limit))

# Mark odd multiples of primes as not prime.  Even multiples
# do not need to be marked since primes() won't try them.
nonprimes = set()
for p in primes(3, sqrt_limit+1):
    # Mark odd nonprimes within the base range.  p*3 is the first
    # odd multiple of p; p+p is the increment to get to the next
    # odd multiple.
    nonprimes.update(xrange(p*3, sqrt_limit+1, p+p))

    # Mark odd nonprimes within each of the requested ranges.
    for (m, n) in bounds:
        # Align m to the first odd multiple of p in the range
        # (or the last odd multiple before the range).
        m -= (m % (p + p) - p)
        m = max(m, p*3)
        nonprimes.update(xrange(m, n+1, p+p))

# Generate and write the primes over each input range.
first = True
for (m, n) in bounds:
    if not first:
        print
    first = False
    for p in primes(m, n+1):
        print p
8<--------------------------------------------------------------------



More information about the Python-list mailing list