Speed up this code?
Ben Cartwright
bencvt at gmail.com
Thu May 25 21:24:03 EDT 2006
aomighty at gmail.com wrote:
> I'm creating a program to calculate all primes numbers in a range of 0
> to n, where n is whatever the user wants it to be. I've worked out the
> algorithm and it works perfectly and is pretty fast, but the one thing
> seriously slowing down the program is the following code:
>
> def rmlist(original, deletions):
> return [i for i in original if i not in deletions]
>
> original will be a list of odd numbers and deletions will be numbers
> that are not prime, thus this code will return all items in original
> that are not in deletions. For n > 100,000 or so, the program takes a
> very long time to run, whereas it's fine for numbers up to 10,000.
>
> Does anybody know a faster way to do this? (finding the difference all
> items in list a that are not in list b)?
The "in" operator is expensive for lists because Python has to check,
on average, half the items in the list. Use a better data structure...
in this case, a set will do nicely. See the docs:
http://docs.python.org/lib/types-set.html
http://docs.python.org/tut/node7.html#SECTION007400000000000000000
Oh, and you didn't ask for it, but I'm sure you're going to get a dozen
pet implementations of prime generators from other c.l.py'ers. So
here's mine. :-)
def primes():
"""Generate prime numbers using the sieve of Eratosthenes."""
yield 2
marks = {}
cur = 3
while True:
skip = marks.pop(cur, None)
if skip is None:
# unmarked number must be prime
yield cur
# mark ahead
marks[cur*cur] = 2*cur
else:
n = cur + skip
while n in marks:
# x already marked as multiple of another prime
n += skip
# first unmarked multiple of this prime
marks[n] = skip
cur += 2
--Ben
More information about the Python-list
mailing list