[Tutor] Can I speed this up?
kent_johnson at skillsoft.com
Fri Oct 15 00:56:01 CEST 2004
Most likely the best way to speed this up would be to use a better
algorithm. But anyway it's fun to play with optimizations, and I find it
interesting to learn where the bottlenecks are and what can be done about
them. Here is my version of factorsOfInteger(). I'll show the code first,
then talk about what I have learned...
def findFactor(n, startingAt):
For n >= 4610000000000000000
limit = int(sqrt(n)) + 1
x = startingAt
while x < limit:
if not n % x:
x += 2
factors = 
# Get the factors of two out of the way to simplify findFactor
while n % 2 == 0:
n = n / 2
lastFactor = 3
r = n
factor = findFactor(r, lastFactor)
if not factor:
# r is prime
lastFactor = factor
r = r / factor
if n in factors:
The first thing I did was to rewrite factorsOfInteger() to use a
factor-finding subroutine instead of isPrime. This way the program becames
- look for a factor
- divide out the factor
- repeat until there are no more factors (what is left is prime)
This rewrite gets rid of the problem the original program had of finding
the first factor of n twice - once in isPrime and once in the main loop.
My first rewrite was a little simpler than what I have here. At first I
didn't keep track of lastFactor, I just started over from 2 every time.
Surprisingly, it doesn't make it much faster to keep track with lastFactor.
I think that is because with the sizes of numbers we are using, once the
first large factor has been found, the second findFactor() is relatively
quick even starting from 2.
I put the handling for factors of 2 into the main function because
findFactor with the starting increment and the special case for 2 was too ugly.
One thing I found out is that it doesn't make much difference to use
xrange() or an explicit loop. I tried it with timeit and it looks like the
while loop might actually be faster:
C:\Python23\Lib>python timeit.py -s "x = 3" "while x < 3000000: x += 2"
10 loops, best of 3: 2.71e+004 usec per loop
C:\Python23\Lib>python timeit.py "for i in xrange(3, 3000000, 2): pass"
10 loops, best of 3: 1.03e+005 usec per loop
So I took out the xrange() version and just used the while loop.
One interesting thing is that the test
if not n % x:
is faster than
if n % x == 0:
I found this out by looking at the disassembled byte codes of the
findFactor function. Here is how:
>>> import Primes
>>> import dis
62 0 LOAD_GLOBAL 0 (int)
3 LOAD_GLOBAL 1 (sqrt)
6 LOAD_FAST 0 (n)
9 CALL_FUNCTION 1
12 CALL_FUNCTION 1
15 LOAD_CONST 1 (1)
19 STORE_FAST 2 (limit)
63 22 LOAD_FAST 1 (startingAt)
25 STORE_FAST 3 (x)
64 28 SETUP_LOOP 48 (to 79)
>> 31 LOAD_FAST 3 (x)
34 LOAD_FAST 2 (limit)
37 COMPARE_OP 0 (<)
40 JUMP_IF_FALSE 34 (to 77)
65 44 LOAD_FAST 0 (n)
47 LOAD_FAST 3 (x)
52 JUMP_IF_FALSE 8 (to 63)
66 56 LOAD_FAST 3 (x)
60 JUMP_FORWARD 1 (to 64)
>> 63 POP_TOP
67 >> 64 LOAD_FAST 3 (x)
67 LOAD_CONST 2 (2)
71 STORE_FAST 3 (x)
74 JUMP_ABSOLUTE 31
>> 77 POP_TOP
69 >> 79 LOAD_CONST 3 (0)
83 LOAD_CONST 4 (None)
The numbers on the far left ar line numbers. Line 65 is the if statement;
it takes two opcodes to load x and n, one for the modulo operation, one for
the not, then a test and jump. If the test is n %x = 0, the UNARY_NOT is
replaced by a load and a compare:
65 44 LOAD_FAST 0 (n)
47 LOAD_FAST 3 (x)
51 LOAD_CONST 2 (2)
54 COMPARE_OP 2 (==)
57 JUMP_IF_FALSE 8 (to 68)
So using not n % x saves an opcode and makes a slight improvement in the
I tried using Raymond Hettinger's Bind Constants recipe
(http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/277940) but it
didn't help, the only constants are int and sqrt, and they are only called
So, short of learning Pyrex and compiling this to C, or rewriting it
directly in byte codes (there are an annoying number of loads and stores of
x in the main loop) I think this is as fast as I'm going to get with this
At 02:44 PM 10/14/2004 -0400, Kent Johnson wrote:
>The import psyco can be anywhere. The psyco.bind(factorsOfInteger) has to
>come after the definition of factorsOfInteger or you will get a NameError.
>I've been monkeying around with a version of this, I'll post it when I get
>time to write it up...the key for me was instead of isPrime() I have find
>At 11:34 AM 10/14/2004 -0700, Dick Moores wrote:
>>Thanks very much. Got psyco working, and it makes a significant
>>difference. The worst "anomalie" I'd found in the 400 trillion range was
>>400,000,000,092,821 = 19624679*20382499.
>>After implementing a couple of your previous suggestions the time for
>>from 41 to 18 seconds. And now with psyco, to 13 seconds!
>>Version I first asked about is still at
>>I'm still trying to re-rethink isPrime(), isPrimeSmall(), isPrimeBig()
>>BTW I'm wondering why you said to put
>>after the definition of factorsOfInteger(). I have "import time" at the
>>top, above all the functions. Is this wrong (it works there) or non-standard?
>>Kent Johnson wrote at 08:15 10/14/2004:
>>>Unzip the zip file. Copy the folder psyco-1.2/psyco into
>>>Python23/Lib/site-packages. (Create site-packages if you don't already
>>>have it.) Should be good to go then.
>>>At 07:38 AM 10/14/2004 -0700, Dick Moores wrote:
>>>>Kent Johnson wrote at 05:32 10/12/2004:
>>>>>Using psyco gives a small speedup - in my limited testing, about 15%.
>>>>>Install psyco from http://psyco.sourceforge.net, then put these two
>>>>>lines after the definition of factorsOfInteger:
>>>>Sorry to be so dumb about these things, but I can't figure out how to
>>>>install Psyco. I now have psyco-1.2-win-2.3.zip in my C:\Python23
>>>>folder (Win XP). I've poked around in the file but I don't see any
>>>>instructions as to what to next to install Psyco. Also,
>>>><http://psyco.sourceforge.net/psycoguide/index.html> doesn't help this dummy.
>>>>rdm at rcblue.com
>Tutor maillist - Tutor at python.org
More information about the Tutor