# Merging sorted lists/iterators/generators into one stream of values...

Lasse Vågsæther Karlsen lasse at vkarlsen.no
Tue Oct 11 13:12:06 CEST 2005

```<snip>

Another idea for this method would be that in some cases I noticed that
it was useful to know which source each element would come from as well,
as well as removing duplicates from the results.

For instance

s1 = [1, 3, 5, 7]
s2 = [2, 3, 4]

for k, s in merge_by_sort(s1, s2):
print k, "from source", s

this would print:

1 from source 0
2 from source 1
3 from source 1
3 from source 0
4 from source 1
5 from source 0
7 from source 0

and the above list has 3 twice, so possibly:

1 from sources [0]
2 from sources [1]
3 from sources [0, 1]
4 from sources [1]
5 from sources [0]
7 from sources [0]

This latter one would be a slightly more heavy method as it would have
to compare the N first elements of the list or heap to figure out what
indices to yield as well.

However, the previous solution could be:

def merge_by_sort(*sources, **options):
if "cmp" in options:
comparison = options["cmp"]
else:
comparison = cmp

iterables = []
for index, source in enumerate(sources):
try:
source = iter(source)
iterables.append([source.next(), index, source])
except StopIteration:
pass

iterables.sort(cmp=comparison, key=lambda x: x[0], reverse=True)
while iterables:
yield iterables[-1][0], iterables[-1][1]
try:
iterables[-1][0] = iterables[-1][2].next()
if len(iterables) > 1 and comparison(iterables[-1][0],
iterables[-2][0]) > 0:
iterables.sort(comparison, key=lambda x: x[0],
reverse=True)
except StopIteration:
iterables.pop(-1)

--
Lasse Vågsæther Karlsen
http://usinglvkblog.blogspot.com/
mailto:lasse at vkarlsen.no
PGP KeyID: 0x2A42A1C2

```