Merging multiple sorted sequences.

Erik python at
Wed Apr 12 18:15:45 EDT 2017


I need to be able to lazily merge a variable number of already-sorted(*) 
variable-length sequences into a single sorted sequence. The merge 
should continue until the longest sequence has been exhausted.

(*) They may in practice be a lazy source of data known to only ever be 
generated in an order that's "sorted" from the POV of this function.

I have the following - can anyone think of any improvements (other than 
indentation style on the try/excepts ;) )? This seems like the sort of 
thing for which there might be a built-in or recipe that I've missed.

def merge(*sources):
     items = []
     for source in (iter(s) for s in sources):
        try: items.append([next(source), source])
        except StopIteration: pass

     while len(items) > 1:
         items.sort(key=lambda item: item[0])
         lowest = items[0]
         yield lowest[0]

         try: lowest[0] = next(lowest[1])
         except StopIteration: del items[0]

     if len(items) != 0:
        yield items[0][0]
        yield from items[0][1]

Test code:

a = range(10)
b = range(4, 50, 3)
c = range(20, 100, 5)

# Greedy version to verify result:
greedy = list(a) + list(b) + list(c)

# Test multiple sequences:
assert list(merge(a, b, c)) == greedy

# Test a single sequence:
assert list(merge(greedy)) == list(merge(a, b, c))

# Test no sequences:
assert list(merge()) == []
assert list(merge([], (), range(0))) == []

Thanks, E.

More information about the Python-list mailing list