Speed up this code?
Tim Chase
python.list at tim.thechases.com
Thu May 25 21:24:13 EDT 2006
> def rmlist(original, deletions):
> return [i for i in original if i not in deletions]
>
> original will be a list of odd numbers and deletions will be numbers
> that are not prime, thus this code will return all items in original
> that are not in deletions. For n > 100,000 or so, the program takes a
> very long time to run, whereas it's fine for numbers up to 10,000.
>
> Does anybody know a faster way to do this? (finding the difference all
> items in list a that are not in list b)?
Testing for membership in an unsorted list is an O(n) sort of
operation...the larger the list, the longer it takes.
I presume order doesn't matter, or that the results can be sorted
after the fact. If this is the case, it's quite efficient to use
sets which provide intersection/difference/union methods. If you
pass in sets rather than lists, you can simply
return original.difference(deletions)
It's almost not worth calling a function for :) There's also an
in-place version called difference_update().
Once you've found all the results you want, and done all the set
differences you want, you can just pass the resulting set to a
list and sort it, if sorted results matter.
-tkc
More information about the Python-list
mailing list