Efficiently determine where documents differ
richardbp at gmail.com
Mon Jan 4 23:04:12 CET 2010
I have been using the difflib library to find where 2 large HTML
documents differ. The Differ().compare() method does this, but it is
very slow - at least 100x slower than the Unix diff command.
How can I efficiently determine where 2 documents differ in Python?
(Ideally I am after the positions of the differences rather than the
actual text, i.e. the kind of output SequenceMatcher().get_opcodes()
returns.)
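
For illustration, here is a minimal sketch of the position-only approach
mentioned above: run SequenceMatcher over lists of lines and keep the
opcodes that are not "equal". The helper name changed_line_ranges and the
sample strings are made up for this example, and autojunk=False assumes a
Python version whose SequenceMatcher accepts that argument. It skips the
intra-line comparison that Differ().compare() performs, so it may be
noticeably faster on large inputs, but that has not been measured here.

```python
from difflib import SequenceMatcher


def changed_line_ranges(old_text, new_text):
    """Return (tag, i1, i2, j1, j2) tuples for every non-equal block.

    Indices are 0-based, end-exclusive line positions: old_lines[i1:i2]
    differs from new_lines[j1:j2]. (Hypothetical helper, not part of difflib.)
    """
    old_lines = old_text.splitlines()
    new_lines = new_text.splitlines()
    # autojunk=False disables the "popular element" heuristic, which can
    # otherwise ignore frequently repeated lines in long documents.
    matcher = SequenceMatcher(None, old_lines, new_lines, autojunk=False)
    return [op for op in matcher.get_opcodes() if op[0] != "equal"]


if __name__ == "__main__":
    # Made-up sample documents, for demonstration only.
    old = "<html>\n<p>one</p>\n<p>two</p>\n</html>\n"
    new = "<html>\n<p>one</p>\n<p>2</p>\n<p>three</p>\n</html>\n"
    for tag, i1, i2, j1, j2 in changed_line_ranges(old, new):
        print(tag, (i1, i2), (j1, j2))
```

Working on lines rather than characters keeps the compared sequences
short; the trade-off is that the positions are line ranges, not character
offsets.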