Overlap in python

Mark Dickinson dickinsm at gmail.com
Tue Aug 4 22:03:00 CEST 2009


On Aug 4, 7:15 pm, Jay Bird <jay.bird0... at gmail.com> wrote:
> Hi everyone,
>
> I've been trying to figure out a simple algorithm on how to combine a
> list of parts that have 1D locations that overlap into a non-
> overlapping list.  For example, here would be my input:
>
> part name   location
> a                  5-9
> b                  7-10
> c                  3-6
> d                  15-20
> e                  18-23
>
> And here is what I need for an output:
> part name   location
> c.a.b            3-10
> d.e               15-23
>
> I've tried various methods, which all fail.  Does anyone have an idea
> how to do this?

Here's an approach that might work.  Turning it into a sensible
function and parsing input/massaging output properly are left
as an exercise.  Blocks that just touch (e.g. (3, 6) then (6, 9))
are amalgamated;  if this isn't what you want, and they should
be treated as separate instead, then replace 'Stop' with 'Finish'
(or any other string that sorts before 'Start').


input = [
    ('a', (5, 9)),
    ('b', (7, 10)),
    ('c', (3, 6)),
    ('d', (15, 20)),
    ('e', (18, 23)),
    ]

# create sorted list of points where an interval is entered or left
transitions = []
for name, (start, stop) in input:
    transitions.extend([(start, 'Start', name), (stop, 'Stop', name)])
transitions.sort()

# scan list, accumulating blocks along the way.
count = 0
for i, (pt, startend, name) in enumerate(transitions):
    if startend == 'Start':
        if not count:
            start = pt
            names = []
        count += 1
        names.append(name)
    else:
        count -= 1
        if not count:
            print '.'.join(names), "%s--%s" % (start, pt)



The output that I get from this under Python 2.6 is:

c.a.b 3--10
d.e 15--23


--
Mark



More information about the Python-list mailing list