[Tutor] Prime Factorization Tool

Steven D'Aprano steve at pearwood.info
Thu Dec 1 23:35:07 CET 2011

Wayne Werner wrote:
> Well, there are really only a couple of optimizations that you could make.
> That's the nice (bad?) thing about primes - you really only *can* brute
> force a solution. That's why nice things like encryption exist.

Brute force is a little strong, but not far from the mark. For really large 
primes, hundreds or thousands of digits long, it simply isn't computationally 
feasible to factorize numbers quickly.

But for small numbers, there's a huge difference in speed between the most 
naive brute force methods and the more clever methods. I have a set of 
functions for generating prime numbers, which can be used to factorize larger 
numbers. I benchmarked them by seeing how many primes they could find in five 
minutes. Here are the results on my computer:

Benchmarking primes calculated in 420 seconds...
croft         : 12856761 primes =>  29944 per second
sieve         :  9380437 primes =>  22334 per second
wheel         :  1206656 primes =>   2873 per second
trial_division:   646429 primes =>   1539 per second
turner        :    27896 primes =>     66 per second
naive_primes  :    11964 primes =>     28 per second
wheel2        :    24148 primes =>     57 per second

The fastest method, croft, is over 1000 times faster than the slowest.

> The less obvious optimization is in reference to primes - you don't
> actually have to check all the way up to N. Or even N/2. You only have to
> check numbers up to the square root of N.

Absolutely. This is perhaps the most important optimization you can apply for 
factorizing small numbers. If you check every number, *at least* 99.9% of the 
work you do is unnecessary. The difference between trying to factorize (say) 
100034567 by checking every number, and by checking up to the square root, is 
significant: you literally avoid 99.99% of the work.

> That will easily decrease the running time of your program because now
> instead of dividing N numbers you're dividing sqrt(N) numbers.
> N/2 is the largest possible factor. Any number larger than N/2 is somewhere
> between N/2 and N/1, and since there's no whole number between 2 and 1 you
> know that the largest number you need to check for a factor is N/2, or in
> the case of 100/2 = 50.

But since N/2 is larger than sqrt(N), you don't even need to go that high.

The secret is that as you find each factor, you reduce the number you are 
testing. For example, to factorize 2166, divide by 2, then every odd number up 
to sqrt(2166)=46, stopping when the number reaches 1 (if it does).

2 goes into 2166, so 2 is a factor and leaving 1083
3 goes into 1083, so 3 is a factor and leaving 361
5 doesn't go into 361
7 doesn't go into 361
9 doesn't go into 361
19 goes into 361 twice, so 19 is a repeated factor

and we are done because the number is reduced to 1 and there are no more 
factors: 2166 = 2*3*19*19 exactly.


More information about the Tutor mailing list