[Tutor] Binary search question

Sat Apr 24 16:36:55 CEST 2010

On Sat, Apr 24, 2010 at 10:48 AM, Alan Gauld <alan.gauld at btinternet.com> wrote:
>
> "Steven D'Aprano" <steve at pearwood.info> wrote
>
>>> Interestingly on my PC with Python 3.1:
>>> >>> data = list(range(10000000))
>>> >>> 9999999 in data
>>>
>>> takes an apparently constant, sub second time.
>>
>> With respect Alan, timing operations by eye is pretty lame, what did you
>> do, count your heartbeats? :)
>
> Sure, but I was surprised at the previous post claiming 7 seconds.
> I just wanted to show that 'in' should not be discarded too quickly,
> its "good enough" for the vast majority of cases even with big data
> sets.
>

With regard to my previous post claiming a 7 second time, that was a
mistake on my part. I (conveniently ;-) forgot that the I ran a timing
of 10.000 lookups, not a single one. So yeah, my bad. 'in' actually
seems plenty fast for all but the most gargantuan data sets, and if
you're performing 10k lookups, building a set/frozenset is an
optimization that might be worth looking into (to phrase it
carefully).

So looking at my benchmarks again, it seems like the complexity bisect
adds is only very rarely worth the speedup (in fact, I start getting
memory errors before bisect makes enough of a difference to be worth
it).

Hugo