[Tutor] Comparison Textboxes

Dave Angel davea at davea.name
Tue Aug 27 01:37:35 CEST 2013


On 26/8/2013 17:26, taserian wrote:


> I'm attempting to gather the pieces I need for a simple project I'd like to do for my job, but I'm having a difficult time finding something, and I'm appealing to the hive mind at Tutor for wisdom.
>
> My project needs to compare two or more large textboxes

Please post using text email, not html.  It's messy, more than doubles
the size of most messages, and frequently has bugs in how it gets
displayed.

First question is what environment this is.  What version of Python, and
what operating system?

Next question is what's the reason for the requirement.  Is it an
assignment that must be done by hand, or is it something for your own
use?  Is it an app that will be used by programmers, or by people
unfamiliar with programming tools?

Next question is what will your user be needing the differences for? 
Will they be cutting and pasting from the difference file into the same
or other dialog boxes?  Are they interested in the "minimum" set of
differences, or is there a bias towards showing largish blocks, even if
not every line in the block is different?

Next is how it will be presented.  Some diff programs colorize the "new"
and "deleted" lines, and more or less ignore changes in order.  Others
produce something that could be applied to one file to produce the
other (e.g. for a version control system).
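
To make the two presentation styles concrete, here is a minimal sketch
using Python's standard difflib module.  difflib is my choice for
illustration only, and the function names are made up; nothing says the
project has to use either.  Each textbox is assumed to have been split
into a list of lines already.

```python
import difflib


def patch_style(old_lines, new_lines):
    """Unified diff: output that a patch tool could apply to the old text."""
    # "old" and "new" are just labels for the header lines.
    return list(difflib.unified_diff(old_lines, new_lines,
                                     "old", "new", lineterm=""))


def marked_lines(old_lines, new_lines):
    """Every line, prefixed with '  ', '- ' or '+ ' so a GUI could colorize it."""
    # ndiff also emits '? ' hint lines for intraline changes; drop those here.
    return [line for line in difflib.ndiff(old_lines, new_lines)
            if not line.startswith("?")]
```

The first form suits a version control style workflow; the second is
the easier one to feed into a display that colors added and deleted
lines.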

Finally, is this a requirement for the intellectual stimulation of
coding it from scratch?  Are you curious about the possible algorithms
or do you want something that can be done with minimum effort?

For one set of answers, I'd say to shell out to one of the tools that
any programmer already has available.
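
A rough sketch of what shelling out might look like, assuming Python
3.7 or later and a Unix-style "diff" command on the PATH (on Windows
you would have to substitute some other tool and its own options).
The function name external_diff and the temp file names are
placeholders of mine.

```python
import os
import shutil
import subprocess
import tempfile


def external_diff(text_a, text_b, tool="diff"):
    """Diff two strings by handing them to an external tool as temp files."""
    if shutil.which(tool) is None:
        raise FileNotFoundError(f"{tool!r} is not on PATH")
    with tempfile.TemporaryDirectory() as tmp:
        paths = []
        for name, text in (("old.txt", text_a), ("new.txt", text_b)):
            path = os.path.join(tmp, name)
            with open(path, "w", encoding="utf-8") as handle:
                handle.write(text)
            paths.append(path)
        # -u asks diff for unified output.
        result = subprocess.run([tool, "-u", *paths],
                                capture_output=True, text=True)
    # diff exits with 0 for identical input, 1 for differences, >1 on trouble.
    if result.returncode > 1:
        raise RuntimeError(result.stderr.strip())
    return result.stdout
```

The contents of two textboxes would be passed in as text_a and text_b,
and the returned string shown to the user or parsed further.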

For another such set of answers, I'd write code that analyzed the two
files and found all lines that appeared exactly once in each file.
Each such pair of lines is called an "island". Now, for each
island, examine the lines immediately preceding and following the
present pair, and if they're identical, add them to the island. If in
so doing, you encounter another island, join the two. When none of the
islands can grow any more, you have one possible mapping of identical
lines between files, and the lines not included are differences.
You're in a good position to process those blocks of differences using
the islands for context.
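
The island approach described above could be sketched roughly like
this.  It's untuned and makes some choices the description leaves
open: find_islands and differences are names I made up, each file is a
list of lines, and an island simply stops growing when it reaches a
line already claimed by an earlier island.

```python
from collections import Counter


def find_islands(a, b):
    """Return (start_a, start_b, length) blocks of lines matched between a and b."""
    count_a, count_b = Counter(a), Counter(b)
    pos_b = {line: j for j, line in enumerate(b)}  # only consulted for unique lines
    used_a, used_b = set(), set()
    islands = []
    for i, line in enumerate(a):
        # Seed an island only from a line that occurs exactly once in each file.
        if count_a[line] != 1 or count_b[line] != 1:
            continue
        if i in used_a:
            continue  # already swallowed by an earlier island (the "join")
        lo_a, lo_b = i, pos_b[line]
        hi_a, hi_b = lo_a + 1, lo_b + 1  # half-open ranges
        # Grow upward while the preceding lines are identical and unclaimed.
        while (lo_a > 0 and lo_b > 0 and a[lo_a - 1] == b[lo_b - 1]
               and lo_a - 1 not in used_a and lo_b - 1 not in used_b):
            lo_a -= 1
            lo_b -= 1
        # Grow downward the same way.
        while (hi_a < len(a) and hi_b < len(b) and a[hi_a] == b[hi_b]
               and hi_a not in used_a and hi_b not in used_b):
            hi_a += 1
            hi_b += 1
        used_a.update(range(lo_a, hi_a))
        used_b.update(range(lo_b, hi_b))
        islands.append((lo_a, lo_b, hi_a - lo_a))
    return islands


def differences(a, b, islands):
    """Indices of the lines in a and in b that no island covers."""
    in_a = {i for start, _, n in islands for i in range(start, start + n)}
    in_b = {j for _, start, n in islands for j in range(start, start + n)}
    return ([i for i in range(len(a)) if i not in in_a],
            [j for j in range(len(b)) if j not in in_b])
```

Lines that repeat (blank lines, for instance) never seed an island, but
they do get absorbed when a neighboring island grows over them, which
is what keeps them from producing spurious matches.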

-- 
DaveA



