Prime number algo... what's wrong?
Tim Peters
tim_one at email.msn.com
Sat Mar 29 22:44:08 EST 2003
[Alex Martelli, adds a cool refinement to the Python Cookbook sieve]
> ...
> Alternatively, therefore, one might choose a tiny variant for D's
> values, keeping them to single numbers but also ensuring that an
> entry is never duplicated: it suffices to change the "while True:"
> loop's body to:
>
> p = D.pop(q, None)
> if p:
> x = p + q
> while x in D: x += p
> D[x] = p
> else:
> D[q*q] = q
> yield q
> q += 1
The coolest thing about this dict-based sieve is that it takes storage
proportional to the number of primes less than N (== the maximum prime
desired), instead of storage proportional to N (which "the usual" sieve
algorithms require). There's a well-known priority-queue sieve algorithm
that takes storage proportional to the number of primes less than sqrt(N).
Knuth Volume 3 spells that one out in an exercise (look in the index for
"prime", and follow the first reference).
Just to be extreme, here's a trick-laden version of that one using a dict
instead. The tricks have mostly to do with avoiding multiples of 2 and 3.
Modulo the trickery (common to both), I find the dict-based version much
easier to follow than the priority-queue based one (which I won't show
here) -- and it turns out it's much faster too, at least in Python:
def sieve3(maximum):
"Generate all primes <= maximum."
if maximum >= 2:
yield 2
if maximum >= 3:
yield 3
# candidate takes on all integer values > 3 that aren't divisible by
# 2 or 3.
candidate = 5
delta = 2 # flips between 2 and 4
# Map i to (d, 6*p), where d is 2*p or 4*p, p is a prime dividing i,
# and i+d is the next multiple of p larger than i not divisible by
# 2 or 3. As the algorithm proceeds, f will contain one entry for
# each prime <= sqrt(maximum) discovered so far (excepting 2 and 3).
f = {}
fetch = f.pop
adding = candidate**2 <= maximum
while candidate <= maximum:
if candidate in f:
d, s = fetch(candidate)
# candidate + d is next multiple of s/6 that isn't divisible
# by 2 or 3
i = candidate + d
d = s-d
while i in f:
i += d
d = s-d
f[i] = d, s
else:
yield candidate
if adding:
sq = candidate**2
f[sq] = delta * candidate, 6 * candidate
adding = sq <= maximum
candidate += delta
delta = 6 - delta
This kind of algorithm seems nearly inconceivable if dicts aren't part of
your mental toolkit. Every good algorithm contains a dict <wink>.
More information about the Python-list
mailing list