[Tutor] RE: Prime Numbers

Wed Feb 26 16:20:04 2003

Ike Hall wrote:

>First, I decided that what I would do is define a function,
>prime(x,y) that returned a tuple containing all primes between x and y.
>code below:
>

A few (hopefully instructive) style comments on your code, here...  :)

>def prime(x,y):
>  #first, I want to be sure that I have numbers for x and y
>
>  if type(x) != IntType or FloatType or LongType:
>    print 'I need numbers people!'
>    return  None
>  elif type(y)!= IntType or FloatType or LongType:
>    print 'I need numbers people!'
>    return None
>  else:
>    pass
>

Typechecking like this really isn't good practice.  What if someone else 
introduces a new numeric type?  They might still reasonably want to pass 
that in, and it may well support all the operations you need, but your 
typechecking will exclude it needlessly.  One of the Python maxims is 
"It's easier to ask forgiveness than permission" -- which usually 
translates to trying what you need to do, and catching any exceptions if 
it doesn't work.

All we need to do is determine whether our input can handle division, 
really.  So we can simply test it like this:

    try:
        temp = x / 2
        temp = y / 2
    except:
        print "Arguments must be numeric values!"
        return

(Note that 'return None' does exactly the same thing as 'return' by 
itself.)  This means we don't need to import types at all, and results 
in clearer and more generic code.  (Any time I see 'elif' I start 
wondering if there might not be a better way to organize things...)

>  #next, we want to find out which is smaller, and make #sure that is 'x'
>  if x>y:
>    dummy = x
>     x=y
>     y=dummy
>

Python can actually do this without using a dummy intermediate variable:

        if x > y:
            x, y = y, x

>  #now we also want to make sure that there is at least one
>  #integer between x and y (floats cannot be primes, but
>  # primes can lie between 2 floats if there is an integer
>  #between them
>

Is it really necessary to special-case this?  Two floats that are that 
close together have no primes between them, true, but is that 
significantly different from having inputs of, say, 7.9 and 10.1 where 
there are no primes between them?  I'd say skip this check, and simply 
return an empty list in this case (as with any other case where no 
primes are found).

>  #now we have done all the checks we need, so we now turn our numbers into
>integers if they are not already
>  #we add one to int(x) or long(x) to get the next integer, since we want
>primes between x and y
>  if type(x)==FloatType:
>     try:
>          x=int(x)+1
>     except(OverflowError):
>          x=long(x)+1
>
>  if type(y)==FloatType:
>     try:
>           y=int(y)
>     except(OverflowError):
>           y=long(y)
>

Of course, your previous code already tried to do int(x) without 
catching any exceptions, so if you were going to get an OverflowError 
you'd already have done so.  ;)  (Note, though, that my suggested code 
doesn't care so far whether something is an int or a long, and wouldn't 
have this problem.)  But really, is there a significant advantage to 
using ints instead of longs?  Why not just automatically use longs for 
everything?  Once again, this gets you away from specific typechecking. 
 Or, since fractions are really all we have to worry about (whether 
they're floats or third-party arbitrary-precision decimals or 
third-party rational numbers or...)

    if x != long(x):
        x = long(x) + 1
    if y != long(y):
        y = long(y) + 1

This takes advantage of the fact that Python is smart about comparing 
different numeric types, and keeps everything generic.  You're passing 
up a bit of efficiency in the case of x and y being fractions within the 
range of MAXINT, since longs are slower than ints, but this seems like a 
rare enough case to not worry overmuch about.  (Premature optimization 
is the root of all evil!)

>  primes=[]
>  if 1 in range(x,y):primes.append(1)
>  if 2 in range(x,y):primes.append(2)
>

Instead of generating two lists, and checking for membership in them, 
why not do straight numeric comparisons?

    if x <= 1 <= y:  primes.append(1)
    if y <= 2 <= y:  primes.append (2)

>  if y<=3:
>    if y==3:primes.append(3)
>    return primes
>  if x<3:start=3
>  elif x%2: start=x
>  else:start=x+1
>  for i in range(start,y,2):
>    high=int(math.sqrt(i))   #we only check between 2 and sqrt(i)
>    primeflag=1
>    for j in range(2, high):
>       if not i%j:
>             primeflag=0
>       else:
>             pass
>    if primeflag:
>        primes.append(i)
>

Boy, that's complicated, and it's going to be doing a lot of extra work, 
with the inner loop testing numbers that we already know aren't prime. 
 I'd do this a bit differently.  I'd generate all prime numbers between 
1 and y (because you can use the earlier primes to simplify searches for 
later primes -- remember that once you check if all prime numbers are 
factors, you've checked non-prime numbers by implication, as a non-prime 
factor must have smaller prime factors) and then throw away all the 
numbers below x.

This method also means that we don't need to specifically check that 1 
and 2 are greater than x, as well.  In fact, if we also throw away any 
primes that are greater than y, we can just start with 1 and 2.  So 
let's completely rewrite this segment of the function.

    primes = [1, 2]
    for candidate in range(3, y, 2):
        limit = sqrt(candidate)
        for factor in primes:
            if factor > limit:    # then number is prime
                primes.append(candidate)
                break
            elif candidate % factor == 0:   # then it's *not* prime
                break

    primes = [item for item in primes if x <= item < y]
    return tuple(primes)

Note that, in both the range() statement and the list comp that makes 
the final trim, if y is a prime number it will *not* be in the returned 
tuple.  This is intentional, to match with the behavior of range() 
itself, and with slices, etc.  Python is pretty consistent about using 
half-open intervals (where the lower bound is included but the upper 
bound is excluded), so I'm remaining consistent with the Python standard 
to maintain the principle of least surprise.

Jeff Shannon
Technician/Programmer
Credit International