[Tutor] List comprehensions to search a list--amazing!

Thu Mar 19 06:10:30 CET 2015

On 03/19/2015 12:20 AM, boB Stepp wrote:
> I hope extolling the beauty and power of Python on this list is
> allowed, because I have had a large "WOW!!!" moment tonight. I had a
> problem I was working on at work this afternoon. I have a list of ~
> 10,000 floating point numbers, which run from largest to smallest.
> There are duplicates scattered throughout, so I might have something
> like: [5942.6789, 5942.6789, 5941.000003, 5941.01, 5941.01, ... ],
> etc. I wanted to search the list for a test value, which, depending on
> the particular list (I can have many such lists, each different from
> the other.), could conceivably be anywhere within the given list. I
> needed to return the index where the list values change from being
> just greater than the test value to just less than the test value at
> the very next index position. I spent a good chunk of my afternoon
> writing a binary search function and wondering what theoretically the
> optimum search algorithm would be, got interrupted (as usual on this
> project), and decided to look at my books at home to see if a better
> solution would be staring at me from some book (Like there usually
> is!).
>
> I haven't studied list comprehensions formally yet, but a snippet of
> code in a book caught my eye where the author was discussing filtering
> data in a list. This led me to try:
>
> The generalized problem:
>
> L = [V0, V1, ..., Vn], where V0 >= V1 >= V2 >= ... >= Vn .
> Find index i, such that V[i] >= Vt >= V[i + 1], where Vt is the test
> value being searched for. I need to know the indices i and i + 1,
> which I need to interpolate based on where Vt falls.
>
> The solution (As a sublist, S)  I worked out tonight after
> experimenting with comprehension syntax is:
> S = [i for i, V in enumerate(L) if L[i] >= Vt >= L[i + 1]]
>
> And, of course, the index i I need is:
> i = S[0]
>
> I tested this out with concrete examples in the interpreter, such as
> with a list, L:
>
> L = [item for item in range(10000, 0, -1)]
>
> and trying different test values. It was blazingly fast, too!
>
> All I can say is: WOW!!!
>
>

That's very innovative.

The catch to a list comprehension is it has to visit all the elements, 
while a binary search would visit log-base-2 of them.  So instead of 
10000 elements, you'd be searching about 14 items.

For large lists, it'd probably be much quicker to use the bisect module.
https://docs.python.org/3.4/library/bisect.html

Check out bisect.bisect_left() and bisect.bisect_right()

I don't see how to directly use those functions on a list which is 
reverse-sorted, but the source is available.  On my install, it's 
located at:

/usr/lib/python3.4/bisect.py

-- 
DaveA