Prime number module
Alex Martelli
aleax at aleax.it
Tue Sep 30 09:47:15 EDT 2003
Tim Hochberg wrote:
...
> I believe you could implement a hybrid scheme that would be quite fast
> and still maintain nearly the same level of compression that you
> describe above. In addition to the above compressed data, also store,
> uncompressed, every Nth prime. A binary search will get you within N
> primes of your answer; to find the exact value, recreate those N primes.
> For an N of, for instance, 64, the level of compression would be minimally
> affected but should make finding the number of primes less than a given
> number much faster than the basic compressed scheme.
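The checkpoint idea Tim describes can be sketched in plain Python, with a small sieve standing in for the compressed prime store; the sieve, the `every = 64` spacing, and all helper names here are illustrative choices, not part of the original proposal:

```python
import bisect

def sieve_primes(limit):
    """Return all primes below `limit` (simple sieve of Eratosthenes)."""
    flags = bytearray([1]) * limit
    flags[0:2] = b'\x00\x00'
    for i in range(2, int(limit ** 0.5) + 1):
        if flags[i]:
            flags[i*i::i] = bytearray(len(flags[i*i::i]))
    return [i for i in range(limit) if flags[i]]

primes = sieve_primes(10000)      # stands in for the compressed prime data
every = 64
checkpoints = primes[::every]     # every 64th prime, kept uncompressed

def count_primes_upto(n):
    """Count primes <= n (for n below the sieve limit): binary-search the
    checkpoints, then walk at most `every` primes past the nearest one."""
    i = bisect.bisect_right(checkpoints, n)
    if i == 0:
        return 0                  # n is below the first prime
    k = (i - 1) * every           # index in `primes` of the last checkpoint <= n
    while k + 1 < len(primes) and primes[k + 1] <= n:
        k += 1
    return k + 1
```

In the real scheme the inner walk would regenerate up to N primes from the compressed deltas rather than index a full list, but the shape of the lookup is the same.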
Still thinking of gmpy's primitives, I came up with a variant of this...:
import gmpy
import array
import bisect
import warnings

highest_number_of_interest = gmpy.mpz(100*1000*1000)
def make_primes(put_them_here, every=1000):
    current_prime = gmpy.mpz(put_them_here[-1])
    count = 0
    while current_prime < highest_number_of_interest:
        current_prime = current_prime.next_prime()
        count += 1
        if count == every:
            put_them_here.append(int(current_prime))
            count = 0
try:
    savefile = open('primes.dat', 'rb')
except IOError:
    every_1000th_prime = array.array('l', [1])
    make_primes(every_1000th_prime)
    savefile = open('primes.dat', 'wb')
    every_1000th_prime.tofile(savefile)
    savefile.close()
else:
    every_1000th_prime = array.array('l', [])
    try: every_1000th_prime.fromfile(savefile, 6000)
    except EOFError: pass
    else:
        warnings.warn("Only %d primes (?)" % len(every_1000th_prime))
        warnings.warn("Highest is %d" % every_1000th_prime[-1])
        # could fill out and re-save the array here, if needed
    savefile.close()
print "%d primes stored (1 every 1000), highest is %d" % (
len(every_1000th_prime), every_1000th_prime[-1])
def nth_prime(N):
    N_div_1000, N_mod_1000 = divmod(N, 1000)
    try: prime = every_1000th_prime[N_div_1000]
    except IndexError:
        raise ValueError, "must be N<%d" % (
            1000*len(every_1000th_prime)+999)
    prime = gmpy.mpz(prime)
    for x in xrange(N_mod_1000):
        prime = prime.next_prime()
    return int(prime)
print "55555th prime is", nth_prime(55555)
def primes_upto(N):
    if N > highest_number_of_interest:
        raise ValueError, "must be N<%d" % highest_number_of_interest
    x = bisect.bisect_left(every_1000th_prime, N)
    if N == every_1000th_prime[x]: return x*1000+1
    prime = gmpy.mpz(every_1000th_prime[x-1])
    x = (x-1)*1000
    while prime < N:
        x += 1
        prime = prime.next_prime()
    if prime == N: x += 1
    return x
print "Up to 555555, there are %d primes" % primes_upto(555555)
The first run, which builds the file primes.dat (just 23048 bytes), took
almost 7 minutes of elapsed time on my machine (250 CPU seconds). But
any subsequent run is extremely fast:
[alex at lancelot perex]$ time python -O prio.py
5762 primes stored (1 every 1000), highest is 99991609
55555th prime is 686671
Up to 555555, there are 45741 primes
0.06user 0.00system 0:00.14elapsed 41%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (494major+297minor)pagefaults 0swaps
[alex at lancelot perex]$
I've coded this rapidly and sloppily, so I wouldn't be surprised if
there were off-by-one errors here and there (it needs to be checked
against some other, independent way to generate/check primes!), but I
think the fundamental ideas are sound.
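One such independent check: a small sieve gives reference versions of both queries, and a few well-known values can be compared directly. The helper names and the sieve bound of 700000 (just big enough to cover the 55555th prime quoted above) are my own choices here:

```python
import bisect

def sieve_primes(limit):
    """Return all primes below `limit` (sieve of Eratosthenes)."""
    flags = bytearray([1]) * limit
    flags[0:2] = b'\x00\x00'
    for i in range(2, int(limit ** 0.5) + 1):
        if flags[i]:
            flags[i*i::i] = bytearray(len(flags[i*i::i]))
    return [i for i in range(limit) if flags[i]]

ref = sieve_primes(700000)

def ref_nth_prime(n):
    """1-based, matching nth_prime above: ref_nth_prime(1) == 2."""
    return ref[n - 1]

def ref_primes_upto(n):
    """Count of primes <= n."""
    return bisect.bisect_right(ref, n)

# Spot-checks against well-known values; the script's nth_prime(55555)
# and primes_upto(555555) results can be compared against the reference
# functions the same way.
assert ref_nth_prime(100) == 541
assert ref_primes_upto(1000) == 168
```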
Alex