[Tutor] "bisect" vs "in"
Albert-Jan Roskam
fomcl at yahoo.com
Thu Sep 20 13:58:31 CEST 2012
Hi,
I want to repeatedly search a list to test whether that list contains a given element X.
Of course, I can use membership testing using "in" (list._contains__).
But how is this method implemented? It seems to just walk through the whole list from
beginning to end (see timeit code below). Is it faster to do a binary search on a sorted list,
given that this task has to be done repeatedly? Can such a program be scaled to lists
that do not fit into memory (maybe using shelve?)?
This is the function I made:
import bisect
def binarySearch(haystack, needle):
return haystack[bisect.bisect_right(haystack, needle) - 1] == needle
# needle is at the beginning of the sequence
>>> t = timeit.Timer("""import bisect
def binarySearch(haystack, needle):
return haystack[bisect.bisect_right(haystack, needle) - 1] == needle
binarySearch(range(10**7), 0)""")
>>> t.timeit(100)
16.737915573068676
>>> t2 = timeit.Timer("0 in range(10**7)")
>>> t2.timeit(100)
16.304621956142682
>>>
# needle is at the end of the sequence
>>> t = timeit.Timer("""import bisect
def binarySearch(haystack, needle):
return haystack[bisect.bisect_right(haystack, needle) - 1] == needle
binarySearch(range(10**7), 10**7-1)""")
>>> t.timeit(100)
16.954997632380582
>>> t2 = timeit.Timer("10**7-1 in range(10**7)")
>>> t2.timeit(100)
36.3686376341127
Thank you in advance!
Regards,
Albert-Jan
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
All right, but apart from the sanitation, the medicine, education, wine, public order, irrigation, roads, a
fresh water system, and public health, what have the Romans ever done for us?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
More information about the Tutor
mailing list