efficient interval containment lookup

Steven D'Aprano steven at REMOVE.THIS.cybersource.com.au
Mon Jan 12 23:00:35 CET 2009


On Mon, 12 Jan 2009 11:36:47 -0800, Per Freem wrote:

> hello,
> 
> suppose I have two lists of intervals, one significantly larger than the
> other.
[...]
> What is an efficient way to this?  One simple way is:

http://en.wikipedia.org/wiki/Interval_tree


> for a_range in listA:
>   for b_range in listB:
>     is_within(b_range, a_range):
>       # accumulate a counter here
> 
> where is_within simply checks if the first argument is within the
> second.

This will be O(m*n) where m is the size of the smaller list and n is the 
size of the big one. It shouldn't matter which order you do the loops, 
you still have to iterate over every point in each list, for every point 
in the other list.

Using an interval tree may let you decrease the time to O(m*log n), at 
the cost of extra overhead, but for millions of items the saving should 
be large.


-- 
Steven



More information about the Python-list mailing list