# Prime number algo... what's wrong?

Tim Peters tim_one at email.msn.com
Sun Mar 30 05:44:08 CEST 2003

```[Alex Martelli, adds a cool refinement to the Python Cookbook sieve]
> ...
> Alternatively, therefore, one might choose a tiny variant for D's
> values, keeping them to single numbers but also ensuring that an
> entry is never duplicated:  it suffices to change the "while True:"
> loop's body to:
>
>         p = D.pop(q, None)
>         if p:
>             x = p + q
>             while x in D: x += p
>             D[x] = p
>         else:
>             D[q*q] = q
>             yield q
>         q += 1

The coolest thing about this dict-based sieve is that it takes storage
proportional to the number of primes less than N (== the maximum prime
desired), instead of storage proportional to N (which "the usual" sieve
algorithms require).  There's a well-known priority-queue sieve algorithm
that takes storage proportional to the number of primes less than sqrt(N).
Knuth Volume 3 spells that one out in an exercise (look in the index for
"prime", and follow the first reference).

Just to be extreme, here's a trick-laden version of that one using a dict
instead.  The tricks have mostly to do with avoiding multiples of 2 and 3.
Modulo the trickery (common to both), I find the dict-based version much
easier to follow than the priority-queue based one (which I won't show
here) -- and it turns out it's much faster too, at least in Python:

def sieve3(maximum):
"Generate all primes <= maximum."
if maximum >= 2:
yield 2
if maximum >= 3:
yield 3
# candidate takes on all integer values > 3 that aren't divisible by
# 2 or 3.
candidate = 5
delta = 2  # flips between 2 and 4
# Map i to (d, 6*p), where d is 2*p or 4*p, p is a prime dividing i,
# and i+d is the next multiple of p larger than i not divisible by
# 2 or 3.  As the algorithm proceeds, f will contain one entry for
# each prime <= sqrt(maximum) discovered so far (excepting 2 and 3).
f = {}
fetch = f.pop
adding = candidate**2 <= maximum
while candidate <= maximum:
if candidate in f:
d, s = fetch(candidate)
# candidate + d is next multiple of s/6 that isn't divisible
# by 2 or 3
i = candidate + d
d = s-d
while i in f:
i += d
d = s-d
f[i] = d, s
else:
yield candidate
sq = candidate**2
f[sq] = delta * candidate, 6 * candidate
adding = sq <= maximum
candidate += delta
delta = 6 - delta

This kind of algorithm seems nearly inconceivable if dicts aren't part of
your mental toolkit.  Every good algorithm contains a dict <wink>.

```