Merging multiple sorted sequences.
Erik
python at lucidity.plus.com
Wed Apr 12 18:15:45 EDT 2017
Hi.
I need to be able to lazily merge a variable number of already-sorted(*)
variable-length sequences into a single sorted sequence. The merge
should continue until the longest sequence has been exhausted.
(*) They may in practice be a lazy source of data known to only ever be
generated in an order that's "sorted" from the POV of this function.
I have the following - can anyone think of any improvements (other than
indentation style on the try/excepts ;) )? This seems like the sort of
thing for which there might be a built-in or recipe that I've missed.
-----------------------------------------
def merge(*sources):
items = []
for source in (iter(s) for s in sources):
try: items.append([next(source), source])
except StopIteration: pass
while len(items) > 1:
items.sort(key=lambda item: item[0])
lowest = items[0]
yield lowest[0]
try: lowest[0] = next(lowest[1])
except StopIteration: del items[0]
if len(items) != 0:
yield items[0][0]
yield from items[0][1]
-----------------------------------------
Test code:
-----------------------------------------
a = range(10)
b = range(4, 50, 3)
c = range(20, 100, 5)
# Greedy version to verify result:
greedy = list(a) + list(b) + list(c)
greedy.sort()
# Test multiple sequences:
assert list(merge(a, b, c)) == greedy
# Test a single sequence:
assert list(merge(greedy)) == list(merge(a, b, c))
# Test no sequences:
assert list(merge()) == []
assert list(merge([], (), range(0))) == []
-----------------------------------------
Thanks, E.
More information about the Python-list
mailing list