efficient interval containment lookup

Per Freem perfreem at yahoo.com
Mon Jan 12 20:36:47 CET 2009


suppose I have two lists of intervals, one significantly larger than
the other.
For example listA = [(10, 30), (5, 25), (100, 200), ...] might contain
of elements while listB (of the same form) might contain hundreds of
or millions of elements.
I want to count how many intervals in listB are contained within every
listA. For example, if listA = [(10, 30), (600, 800)] and listB =
[(20, 25), (12, 18)] is the input, then the output should be that (10,
30) has 2 intervals from listB contained within it, while (600, 800)
has 0. (Elements of listB can be contained within many intervals in
listA, not just one.)

What is an efficient way to this?  One simple way is:

for a_range in listA:
  for b_range in listB:
    is_within(b_range, a_range):
      # accumulate a counter here

where is_within simply checks if the first argument is within the

I'm not sure if it's more efficient to have the iteration over listA
be on the outside or listB.  But perhaps there's a way to index this
that makes things more efficient?  I.e. a smart way of indexing listA
such that I can instantly get all of its elements that are within some
of listB, maybe?  Something like a hash, where this look up can be
close to constant time rather than an iteration over all lists... if
there's any built-in library functions that can help in this it would
be great.

any suggestions on this would be awesome. thank you.

More information about the Python-list mailing list