Merging ordered lists

Tue Jun 3 13:35:42 EDT 2008

On Jun 2, 11:08 pm, Raymond Hettinger <pyt... at rcn.com> wrote:
>
> If the inputs were not sorted, then I don't think you have a precise
> idea of what it means to merge them while preserving order.   For
> example if the inputs are XYZPDQ and bYlmPz, then what does a merged
> sequence look like once the Y and P duplicates are removed? Is it
> XZPDQblmz or some intermix of the two like XbZlPmDzQ?  If it is the
> first of those, then the answer is simple:
>

I was looking at Bram Cohen's description of his diff algorithm,
implemented in Bazaar:
http://bramcohen.livejournal.com/37690.html
"Instead of doing a longest common subsequence on everything, it does
a longest common subsequence on lines which occur exactly once on both
sides, then recurses between lines which got matched on that pass."

So, if sources looks like:
[list("XYZPDQ"), list("bYlmPz")]

Then the proper merge would use Y and P as the delimiters, putting X
and b before Y; Z, l and m before P; and D, Q and z after P -- like
your second example (but keeping an instance of Y). Right? Leaving us
with the context-dependent issue of ordering between the delimiters,
probably depending on which list is used as the reference initially.
So, if the reference list's elements go first, the result would be
XbYZlmPDQz. If the elements from later sources go after the last
common element (my first approach), it's bXYlmZPzDQ.

However, in the cold light of morning it looks like that was the wrong
approach -- it makes sense to treat any data that doesn't occur in the
reference list as semi-obsolete and append it instead -- so I think
I'll use your much simpler algorithm. Thanks!