# [Edu-sig] Pythonic Math must include...

Gregor Lingl gregor.lingl at aon.at
Mon Jan 19 00:31:36 CET 2009

```kirby urner schrieb:
> I think you're correct that the sieve best works with a pre-specified
> finite domain:
>
....

>> To continue work in this area one (or at least me) has to have
>> some criteria to judge the solutions.
>> these criteria in the community.
>>
>
> Fortunately, we have hundreds of years of math pedagogy, so in terms
> have" and just render it Pythonically.
....

>> There should be some criteria concerning
>> (a) the choice of problems and themes,
>>    e.g. to prefer small problems that expose a single idea  -  or rather not
>> ...   etc.,
>> as well as some
>> (b) code related criteria, like clarity, conciseness, efficiency, beauty (!)
>> etc., ranked according to
>> their priorities.
>>
>
> This will be up to each professional teacher in whatever walk of life
> -- to judge what to include and what to exclude.  Each teacher will
> find her or himself in agreement with some, disagreement with others,
> over what to include.  Twas ever thus.
>

I think it's not that easy. I'd like to dive a bit into this topic,
resuming the code examples
efficiency, I've
measured the time to compute the 9592/9593 primes in the range up to
100000 and (in order to
get some idea how the algorithm scales) also those up to 500000 (on my
machine).

Here is your, Kirby's, code:

def primes():
sofar = [-1, 2,3] # a running start, -1 proposed by J.H. Conway
yield sofar[0] # get these out of the way
yield sofar[1] # the only even prime
yield sofar[2] # and then 3
candidate = 5 # we'll increment from here on
while True: # go forever
for factor in sofar[1:]: # skip -1 (or don't use it in the first
place)
if factor ** 2 > candidate:  # did we pass?
yield candidate # woo hoo!
sofar.append(candidate) # keep the gold
break  # onward!
if not candidate % factor: # oops, no remainder
break  # this is a composite
candidate += 2 # next odd number please

Time:    100000:  1.71 s                500000:  42.72 s
-----

Michel Paul's code:

def primes():
sofar = [-1, 2,3] # a running start, -1 proposed by J.H. Conway
yield sofar[0] # get these out of the way
yield sofar[1] # the only even prime
yield sofar[2] # and then 3
candidate = 5 # we'll increment from here on
while True: # go forever
for factor in sofar[1:]: # skip -1 (or don't use it in the first
place)
if factor ** 2 > candidate:  # did we pass?
yield candidate # woo hoo!
sofar.append(candidate) # keep the gold
break  # onward!
if not candidate % factor: # oops, no remainder
break  # this is a composite
candidate += 2 # next odd number please

Time:    100000:  1.58 s                500000:  32.25 s
-----

I've modified this one slightly, with a surprising effect
(I conjecture mainly due to avoiding copying p repeatedly):

def primes():
yield(-1)
p = [2, 3]
for x in p: yield x
def is_prime(n):
for factor in p:
if factor**2 > n: return True
if n%factor == 0: return False
multiple = 6
while True:
for cand in multiple-1, multiple+1:
if is_prime(cand):
yield cand
p.append(cand)
multiple += 6

Time:    100000:  0.14 s                500000:  1.05 s
-----

Scott Daniels code:

def primes():
for x in -1, 2, 3:
yield x
gen = primes().next
for top in iter(gen, 3):
pass # skip useless tests (we skip all multiples of 2 or 3)
factors = []
# get pump ready for a 5
check = -1
limit = 3 * 3 - 2 # How far will 3 work as top prime?
factors = []
while True:
check += 6
if check >= limit: # move if this pair needs another factor
top = gen()
limit = top * top - 2 # limit for both candidates
factors.append(top)
for element in check, check + 2:
for factor in factors:
if element % factor == 0:
break
else:
yield element

Time:    100000:  0.07 s                500000:  0.50 s
-----

Compare the above generators to sieve algorithms:

G.L. minimal sieve

def primes(n):
s = set(range(3,n+1,2))
for m in range(3, int(n**.5)+1, 2):
s.difference_update(range(m*m, n+1, 2*m))
return [2]*(2<=n) + sorted(s)

Time:    100000:  0.014 s                500000:  0.11 s
-----

G.L.: sieve

def primes(n):
s = set(range(3,n+1,2))
m=3
while m * m < n:
s.difference_update(range(m*m, n+1, 2*m))
m += 2
while m not in s: m += 2
return sorted(s)

Time:    100000:  0.012 s                500000:  0.086 s
-----

Apparently sieves are considerably faster at the cost
that the whole sieve has to be calculated before the primes
can be retrieved. Not a disadvantage if you only want to know
the n-th prime.

So this suggests to make use of the sieve-paradigm also for
prime generators. Here is a rather primitive example:

G.L. sieve based generator:

def primes():
CHUNKSIZE = 5000 # arbitrary even
primes = []
yield 2

# first chunk
candidates = range(3,CHUNKSIZE,2)
chunk = set(candidates)
for m in candidates:
if m*m > CHUNKSIZE:
break
chunk.difference_update(range(m*m, CHUNKSIZE, m))
chunk = sorted(list(chunk))
for p in chunk:
yield p
primes.extend(chunk)
lower = CHUNKSIZE

# more chunks
while True:
upper = lower + CHUNKSIZE
chunk = set(range(lower+1,upper,2))
for m in primes:
if m*m > upper:
break
chunk.difference_update(range(lower + m - lower%m, upper, m))
chunk = sorted(list(chunk))
for p in chunk:
yield p
primes.extend(chunk)
lower = upper

Time:    100000:  0.023 s                500000:  0.113 s
-----

Of course there are many more possibilities to realize this, for
instance to use slice-assignment instead of sets. This would
require some somewhat opaque index calculations, but be - as far
as I rember - even faster. Particularly I suppose that there
exist faster algorithms for generating primes, but I admittedly
don't know them.

======================================================

Summing up:

Kirby                          1.71 s                  42.72 s
Michel Paul                    1.58 s                  32.25 s
Michel Paul modified::         0.14 s                   1.05 s
Scott Daniels:                 0.07 s                   0.50 s
G.L. minimal sieve:            0.014 s                  0.109 s
G.L. sieve:                    0.012 s                  0.086 s
G.L. sieve-based generator:    0.023 s                  0.113 s
(performance depends on CHUNKSIZE)

This exposes in my opinion an unsurmountable dilemma, namely
that usually you cannot meet even those few criteria mentioned
in the beginning in a single solution.

So under the aspects you exposed at the beginning of this thread,
"Pythonic Math", which title in some sense includes and/or addresses
this dilemma, how would you prefer to weight those criteria?

Regards

Gregor

```