[Tutor] Can I speed this up?
Kent Johnson
kent_johnson at skillsoft.com
Fri Oct 15 03:49:52 CEST 2004
Well sure, if you just try the easy ones!
How about this?
5,000,000,000,000,001,028 = 2*2*854908727*1462144391 10 minutes, 10 seconds
5,000,000,000,000,005,171 = 800104651*6249182521 11 minutes, 12 seconds
And my machine is a little slower than yours, I think...
BTW I majored in math a long time ago too :-)
Programming is much more fun.
Kent
At 06:07 PM 10/14/2004 -0700, Dick Moores wrote:
>Kent,
>
>Fascinating stuff. Not sure I understand it all--in fact I'm sure I don't.
>
>Of course the first thing I did was to compare times of yours with my
>pre-psyco version,
><http://www.rcblue.com/Python/factorIntegers-forWeb4.py>
>
>Both did
>5,000,000,000,000,003,954 = 5000000000000003954 = 2*17*37*3974562798092213
>in 34 seconds.
>Both did
>400,000,000,139,227 = 400000000139227 = 14492981*27599567 in 8 secs.
>Both did
>400,000,000,139,141 = 400000000139141 (PRIME) in 11 secs.
>
>Of course your code is much nicer, and I'll learn from it.
>
>I'd tried the lastFactor idea earlier today, but couldn't get it to work.
>It's interesting that you found it didn't really help, though I was sure
>it would.
>
>And thanks again for psyco!!
>
>BTW Working on this has gotten me interested in primes and their
>distribution. You'd never guess it, but I majored in math (without any
>interest in number theory) a long time ago.
>
>So maybe I'll try to program some of those other prime number algorithms
>you pointed me towards. Or not. Probably I should move on to another area
>in Python. Still so much to learn.
>
>Dick
>
>Kent Johnson wrote at 15:56 10/14/2004:
>>Most likely the best way to speed this up would be to use a better
>>algorithm. But anyway it's fun to play with optimizations, and I find it
>>interesting to learn where the bottlenecks are and what can be done about
>>them. Here is my version of factorsOfInteger(). I'll show the code first,
>>then talk about what I have learned...
>>
>>def findFactor(n, startingAt):
>> """
>> For n >= 4610000000000000000
>> """
>> limit = int(sqrt(n)) + 1
>> x = startingAt
>> while x < limit:
>> if not n % x:
>> return x
>> x += 2
>>
>> return 0
>>
>>
>>def factorsOfInteger(n):
>> factors = []
>>
>> # Get the factors of two out of the way to simplify findFactor
>> while n % 2 == 0:
>> factors.append(2)
>> n = n / 2
>>
>> lastFactor = 3
>> r = n
>> while True:
>> factor = findFactor(r, lastFactor)
>> if not factor:
>> # r is prime
>> factors.append(r)
>> break
>>
>> factors.append(factor)
>> lastFactor = factor
>> r = r / factor
>>
>> if n in factors:
>> return []
>> return factors
>>
>>
>>The first thing I did was to rewrite factorsOfInteger() to use a
>>factor-finding subroutine instead of isPrime. This way the program
>>becames basically
>>- look for a factor
>>- divide out the factor
>>- repeat until there are no more factors (what is left is prime)
>>
>>This rewrite gets rid of the problem the original program had of finding
>>the first factor of n twice - once in isPrime and once in the main loop.
>>
>>My first rewrite was a little simpler than what I have here. At first I
>>didn't keep track of lastFactor, I just started over from 2 every time.
>>Surprisingly, it doesn't make it much faster to keep track with
>>lastFactor. I think that is because with the sizes of numbers we are
>>using, once the first large factor has been found, the second
>>findFactor() is relatively quick even starting from 2.
>>
>>I put the handling for factors of 2 into the main function because
>>findFactor with the starting increment and the special case for 2 was too ugly.
>>
>>One thing I found out is that it doesn't make much difference to use
>>xrange() or an explicit loop. I tried it with timeit and it looks like
>>the while loop might actually be faster:
>>C:\Python23\Lib>python timeit.py -s "x = 3" "while x < 3000000: x += 2"
>>10 loops, best of 3: 2.71e+004 usec per loop
>>
>>C:\Python23\Lib>python timeit.py "for i in xrange(3, 3000000, 2): pass"
>>10 loops, best of 3: 1.03e+005 usec per loop
>>
>>So I took out the xrange() version and just used the while loop.
>>
>>One interesting thing is that the test
>> if not n % x:
>>is faster than
>> if n % x == 0:
>>
>>I found this out by looking at the disassembled byte codes of the
>>findFactor function. Here is how:
>> >>> import Primes
>> >>> import dis
>> >>> dis.dis(Primes.findFactor)
>> 62 0 LOAD_GLOBAL 0 (int)
>> 3 LOAD_GLOBAL 1 (sqrt)
>> 6 LOAD_FAST 0 (n)
>> 9 CALL_FUNCTION 1
>> 12 CALL_FUNCTION 1
>> 15 LOAD_CONST 1 (1)
>> 18 BINARY_ADD
>> 19 STORE_FAST 2 (limit)
>>
>> 63 22 LOAD_FAST 1 (startingAt)
>> 25 STORE_FAST 3 (x)
>>
>> 64 28 SETUP_LOOP 48 (to 79)
>> >> 31 LOAD_FAST 3 (x)
>> 34 LOAD_FAST 2 (limit)
>> 37 COMPARE_OP 0 (<)
>> 40 JUMP_IF_FALSE 34 (to 77)
>> 43 POP_TOP
>>
>> 65 44 LOAD_FAST 0 (n)
>> 47 LOAD_FAST 3 (x)
>> 50 BINARY_MODULO
>> 51 UNARY_NOT
>> 52 JUMP_IF_FALSE 8 (to 63)
>> 55 POP_TOP
>>
>> 66 56 LOAD_FAST 3 (x)
>> 59 RETURN_VALUE
>> 60 JUMP_FORWARD 1 (to 64)
>> >> 63 POP_TOP
>>
>> 67 >> 64 LOAD_FAST 3 (x)
>> 67 LOAD_CONST 2 (2)
>> 70 INPLACE_ADD
>> 71 STORE_FAST 3 (x)
>> 74 JUMP_ABSOLUTE 31
>> >> 77 POP_TOP
>> 78 POP_BLOCK
>>
>> 69 >> 79 LOAD_CONST 3 (0)
>> 82 RETURN_VALUE
>> 83 LOAD_CONST 4 (None)
>> 86 RETURN_VALUE
>>
>>The numbers on the far left ar line numbers. Line 65 is the if statement;
>>it takes two opcodes to load x and n, one for the modulo operation, one
>>for the not, then a test and jump. If the test is n %x = 0, the UNARY_NOT
>>is replaced by a load and a compare:
>>
>>65 44 LOAD_FAST 0 (n)
>> 47 LOAD_FAST 3 (x)
>> 50 BINARY_MODULO
>> 51 LOAD_CONST 2 (2)
>> 54 COMPARE_OP 2 (==)
>> 57 JUMP_IF_FALSE 8 (to 68)
>> 60 POP_TOP
>>
>>So using not n % x saves an opcode and makes a slight improvement in the
>>execution time.
>>
>>I tried using Raymond Hettinger's Bind Constants recipe
>>(http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/277940) but it
>>didn't help, the only constants are int and sqrt, and they are only
>>called once each.
>>
>>So, short of learning Pyrex and compiling this to C, or rewriting it
>>directly in byte codes (there are an annoying number of loads and stores
>>of x in the main loop) I think this is as fast as I'm going to get with
>>this algorithm.
>>
>>Kent
>>
>>At 02:44 PM 10/14/2004 -0400, Kent Johnson wrote:
>>>The import psyco can be anywhere. The psyco.bind(factorsOfInteger) has
>>>to come after the definition of factorsOfInteger or you will get a NameError.
>>>
>>>I've been monkeying around with a version of this, I'll post it when I
>>>get time to write it up...the key for me was instead of isPrime() I have
>>>find findFactor()
>>>
>>>Kent
>>>
>>>At 11:34 AM 10/14/2004 -0700, Dick Moores wrote:
>>>>Kent,
>>>>
>>>>Thanks very much. Got psyco working, and it makes a significant
>>>>difference. The worst "anomalie" I'd found in the 400 trillion range was
>>>>400,000,000,092,821 = 19624679*20382499.
>>>>After implementing a couple of your previous suggestions the time for
>>>>this went
>>>>from 41 to 18 seconds. And now with psyco, to 13 seconds!
>>>>
>>>><http://www.rcblue.com/Python/factorIntegers-forWeb-WithPsyco4.py>
>>>>Version I first asked about is still at
>>>><http://www.rcblue.com/Python/factorIntegers-forWeb.py>
>>>>
>>>>I'm still trying to re-rethink isPrime(), isPrimeSmall(), isPrimeBig()
>>>>and factorsOfInteger().
>>>>
>>>>BTW I'm wondering why you said to put
>>>>
>>>>import psyco
>>>>psyco.bind(factorsOfInteger)
>>>>
>>>>after the definition of factorsOfInteger(). I have "import time" at the
>>>>top, above all the functions. Is this wrong (it works there) or non-standard?
>>>>
>>>>Dick
>>>>
>>>>
>>>>Kent Johnson wrote at 08:15 10/14/2004:
>>>>>Unzip the zip file. Copy the folder psyco-1.2/psyco into
>>>>>Python23/Lib/site-packages. (Create site-packages if you don't already
>>>>>have it.) Should be good to go then.
>>>>>
>>>>>Kent
>>>>>
>>>>>At 07:38 AM 10/14/2004 -0700, Dick Moores wrote:
>>>>>>Kent Johnson wrote at 05:32 10/12/2004:
>>>>>>>Using psyco gives a small speedup - in my limited testing, about 15%.
>>>>>>>
>>>>>>>Install psyco from http://psyco.sourceforge.net, then put these two
>>>>>>>lines after the definition of factorsOfInteger:
>>>>>>>import psyco
>>>>>>>psyco.bind(factorsOfInteger)
>>>>>>
>>>>>>Sorry to be so dumb about these things, but I can't figure out how to
>>>>>>install Psyco. I now have psyco-1.2-win-2.3.zip in my C:\Python23
>>>>>>folder (Win XP). I've poked around in the file but I don't see any
>>>>>>instructions as to what to next to install Psyco. Also,
>>>>>><http://psyco.sourceforge.net/psycoguide/index.html> doesn't help this dummy.
>>>>>>
>>>>>>Help, please.
>>>>>>
>>>>>>Dick Moores
>>>>>>rdm at rcblue.com
>>>
>>>__
>
>
More information about the Tutor
mailing list