thank Chris.. <br><br><div class="gmail_quote">On Sat, May 5, 2012 at 2:39 PM, Chris Angelico <span dir="ltr"><<a href="mailto:rosuav@gmail.com" target="_blank">rosuav@gmail.com</a>></span> wrote:k<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div class="im">On Sat, May 5, 2012 at 10:12 PM, J. Mwebaze <<a href="mailto:jmwebaze@gmail.com">jmwebaze@gmail.com</a>> wrote:<br>
> This is out of curiosity, i know this can be done with python diffllib<br>
> module, but been figuring out how to compute the delta, Consider two lists<br>
> below.<br>
><br>
> s1 = ['e', 'f', 'g', 'A', 'B', 'C', 'D', 'C']<br>
> s2 =['e', 'A', 'B', 'f', 'g', 'C', 'D', 'z']<br>
><br>
> This is the result should be<br>
><br>
> [' e', '+ A', '+ B', ' f', ' g', '- A', '- B', ' C', ' D', '- C', '+<br>
> z']<br>
><br>
> ideas on how to approach this.. ?<br>
<br>
</div>Here's a simple algorithm that produces sorta-mostly-reasonable<br>
results most of the time:<br>
<br>
Set your current-position-index to the beginning of each lists. (Call<br>
them 'pos1' and 'pos2'.)<br>
If s1[pos1] and s2[pos2] are identical, match - increment and iterate.<br>
Otherwise, increment pos2 until either it matches pos1 or you reach<br>
the end of s2.<br>
If you find a match, report the insertion into s2, increment both<br>
pointers past the match, and carry on.<br>
If you hit the end of s2, increment pos1 once and report an insertion<br>
into s1, then go back to looking for a match.<br>
<br>
It helps to append a sentinel to each list; that way, you don't need<br>
to separately check for additional content at the end of either list<br>
(as the sentinel won't match any of the strings).<br>
<br>
This is the algorithm I used for writing a simple file diff tool ages<br>
ago. It's not as good as diff(1), but it was fun to do the exercise.<br>
It's quite inefficient at handling large insertions into s1, and needs<br>
to be modified for most file handling (for instance, requiring a<br>
2-line or 3-line rematch after a difference, to avoid matching on<br>
blank lines), but it's a basis.<br>
<br>
Producing beautiful or minimal diffs is a more complex task. :)<br>
<br>
ChrisA<br>
<span class="HOEnZb"><font color="#888888">--<br>
<a href="http://mail.python.org/mailman/listinfo/python-list" target="_blank">http://mail.python.org/mailman/listinfo/python-list</a><br>
</font></span></blockquote></div><br><br clear="all"><div><br></div>-- <br><font style="color:rgb(0,51,0)" size="1"><b>Mob UG: +256 (0) 70 1735800 | NL +31 (0) 6 852 841 38 | Gtalk: jmwebaze | skype: mwebazej | URL: <a href="http://www.astro.rug.nl/~jmwebaze" target="_blank">www.astro.rug.nl/~jmwebaze</a><br>
<br>/* Life runs on code */</b></font><br><br>