[Tutor] Can I speed this up?

Kent Johnson kent_johnson at skillsoft.com
Fri Oct 15 00:56:01 CEST 2004

Most likely the best way to speed this up would be to use a better 
algorithm. But anyway it's fun to play with optimizations, and I find it 
interesting to learn where the bottlenecks are and what can be done about 
them. Here is my version of factorsOfInteger(). I'll show the code first, 
then talk about what I have learned...

def findFactor(n, startingAt):
     For n >= 4610000000000000000
     limit = int(sqrt(n)) + 1
     x = startingAt
     while x < limit:
         if not n % x:
             return x
         x += 2

     return 0

def factorsOfInteger(n):
     factors = []

     # Get the factors of two out of the way to simplify findFactor
     while n % 2 == 0:
         n = n / 2

     lastFactor = 3
     r = n
     while True:
         factor = findFactor(r, lastFactor)
         if not factor:
             # r is prime

         lastFactor = factor
         r = r / factor

     if n in factors:
         return []
     return factors

The first thing I did was to rewrite factorsOfInteger() to use a 
factor-finding subroutine instead of isPrime. This way the program becames 
- look for a factor
- divide out the factor
- repeat until there are no more factors (what is left is prime)

This rewrite gets rid of the problem the original program had of finding 
the first factor of n twice - once in isPrime and once in the main loop.

My first rewrite was a little simpler than what I have here. At first I 
didn't keep track of lastFactor, I just started over from 2 every time. 
Surprisingly, it doesn't make it much faster to keep track with lastFactor. 
I think that is because with the sizes of numbers we are using, once the 
first large factor has been found, the second findFactor() is relatively 
quick even starting from 2.

I put the handling for factors of 2 into the main function because 
findFactor with the starting increment and the special case for 2 was too ugly.

One thing I found out is that it doesn't make much difference to use 
xrange() or an explicit loop. I tried it with timeit and it looks like the 
while loop might actually be faster:
C:\Python23\Lib>python timeit.py -s "x = 3" "while x < 3000000: x += 2"
10 loops, best of 3: 2.71e+004 usec per loop

C:\Python23\Lib>python timeit.py  "for i in xrange(3, 3000000, 2): pass"
10 loops, best of 3: 1.03e+005 usec per loop

So I took out the xrange() version and just used the while loop.

One interesting thing is that the test
   if not n % x:
is faster than
   if n % x == 0:

I found this out by looking at the disassembled byte codes of the 
findFactor function. Here is how:
 >>> import Primes
 >>> import dis
 >>> dis.dis(Primes.findFactor)
  62           0 LOAD_GLOBAL              0 (int)
               3 LOAD_GLOBAL              1 (sqrt)
               6 LOAD_FAST                0 (n)
               9 CALL_FUNCTION            1
              12 CALL_FUNCTION            1
              15 LOAD_CONST               1 (1)
              18 BINARY_ADD
              19 STORE_FAST               2 (limit)

  63          22 LOAD_FAST                1 (startingAt)
              25 STORE_FAST               3 (x)

  64          28 SETUP_LOOP              48 (to 79)
         >>   31 LOAD_FAST                3 (x)
              34 LOAD_FAST                2 (limit)
              37 COMPARE_OP               0 (<)
              40 JUMP_IF_FALSE           34 (to 77)
              43 POP_TOP

  65          44 LOAD_FAST                0 (n)
              47 LOAD_FAST                3 (x)
              50 BINARY_MODULO
              51 UNARY_NOT
              52 JUMP_IF_FALSE            8 (to 63)
              55 POP_TOP

  66          56 LOAD_FAST                3 (x)
              59 RETURN_VALUE
              60 JUMP_FORWARD             1 (to 64)
         >>   63 POP_TOP

  67     >>   64 LOAD_FAST                3 (x)
              67 LOAD_CONST               2 (2)
              70 INPLACE_ADD
              71 STORE_FAST               3 (x)
              74 JUMP_ABSOLUTE           31
         >>   77 POP_TOP
              78 POP_BLOCK

  69     >>   79 LOAD_CONST               3 (0)
              82 RETURN_VALUE
              83 LOAD_CONST               4 (None)
              86 RETURN_VALUE

The numbers on the far left ar line numbers. Line 65 is the if statement; 
it takes two opcodes to load x and n, one for the modulo operation, one for 
the not, then a test and jump. If the test is n %x = 0, the UNARY_NOT is 
replaced by a load and a compare:

65          44 LOAD_FAST                0 (n)
             47 LOAD_FAST                3 (x)
             50 BINARY_MODULO
             51 LOAD_CONST               2 (2)
             54 COMPARE_OP               2 (==)
             57 JUMP_IF_FALSE            8 (to 68)
             60 POP_TOP

So using not n % x saves an opcode and makes a slight improvement in the 
execution time.

I tried using Raymond Hettinger's Bind Constants recipe 
(http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/277940) but it 
didn't help, the only constants are int and sqrt, and they are only called 
once each.

So, short of learning Pyrex and compiling this to C, or rewriting it 
directly in byte codes (there are an annoying number of loads and stores of 
x in the main loop) I think this is as fast as I'm going to get with this 


At 02:44 PM 10/14/2004 -0400, Kent Johnson wrote:
>The import psyco can be anywhere. The psyco.bind(factorsOfInteger) has to 
>come after the definition of factorsOfInteger or you will get a NameError.
>I've been monkeying around with a version of this, I'll post it when I get 
>time to write it up...the key for me was instead of isPrime() I have find 
>At 11:34 AM 10/14/2004 -0700, Dick Moores wrote:
>>Thanks very much. Got psyco working, and it makes a significant 
>>difference. The worst "anomalie" I'd found in the 400 trillion range was
>>400,000,000,092,821 = 19624679*20382499.
>>After implementing a couple of your previous suggestions the time for 
>>this went
>>from 41 to 18 seconds. And now with psyco, to 13 seconds!
>>Version I first asked about is still at
>>I'm still trying to re-rethink isPrime(), isPrimeSmall(),  isPrimeBig() 
>>and factorsOfInteger().
>>BTW I'm wondering why you said to put
>>import psyco
>>after the definition of factorsOfInteger(). I have "import time" at the 
>>top, above all the functions. Is this wrong (it works there) or non-standard?
>>Kent Johnson wrote at 08:15 10/14/2004:
>>>Unzip the zip file. Copy the folder psyco-1.2/psyco into 
>>>Python23/Lib/site-packages. (Create site-packages if you don't already 
>>>have it.) Should be good to go then.
>>>At 07:38 AM 10/14/2004 -0700, Dick Moores wrote:
>>>>Kent Johnson wrote at 05:32 10/12/2004:
>>>>>Using psyco gives a small speedup - in my limited testing, about 15%.
>>>>>Install psyco from http://psyco.sourceforge.net, then put these two 
>>>>>lines after the definition of factorsOfInteger:
>>>>>import psyco
>>>>Sorry to be so dumb about these things, but I can't figure out how to 
>>>>install Psyco. I now have psyco-1.2-win-2.3.zip in my C:\Python23 
>>>>folder (Win XP). I've poked around in the file but I don't see any 
>>>>instructions as to what to next to install Psyco.  Also, 
>>>><http://psyco.sourceforge.net/psycoguide/index.html> doesn't help this dummy.
>>>>Help, please.
>>>>Dick Moores
>>>>rdm at rcblue.com
>Tutor maillist  -  Tutor at python.org

More information about the Tutor mailing list