difflib-like library supporting moved blocks detection?

Vlastimil Brom vlastimil.brom at gmail.com
Wed Jul 13 23:13:16 CEST 2011


Hi all,
I'd like to ask about the availability of a text diff library, like
difflib, which would support the detection of moved text blocks.
I am mostly happy with the results of difflib.SequenceMatcher in my
app (which compares different versions of natural-language texts);
the only drawback is that it does not detect moved parts of the text.
I was thinking of simply re-comparing the deleted and inserted blocks
using difflib again, but this obviously won't be a general solution.
I found several discussions of algorithms, but unfortunately no
suitable Python implementation. (E.g. Medite -
http://www-poleia.lip6.fr/~ganascia/Medite_Project - seems to be
implemented in Python, but it targets rather specialised and probably
much more complex textological issues than my current needs.)
Does anyone know of such a Python library (or possibly a way of
enhancing difflib) for this task (i.e. character-wise comparison of
texts, detecting insertion, deletion, substitution and moves of text
blocks)?
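
For illustration, the re-comparison idea mentioned above could be
sketched roughly as follows. This is only a minimal sketch, not a
general solution: the function name find_moves and the min_len
threshold are hypothetical, it uses nothing beyond the standard
difflib API, and it can only find moves that end up inside spans
difflib reported as deleted and inserted.

```python
from difflib import SequenceMatcher


def find_moves(old, new, min_len=5):
    """Return (opcodes, moves); each move is (old_start, new_start, length).

    Hypothetical helper: re-compares every deleted span of `old` with
    every inserted span of `new` and reports common runs of at least
    `min_len` characters as candidate moves.
    """
    opcodes = SequenceMatcher(None, old, new, autojunk=False).get_opcodes()
    # Spans present only in `old` (deleted) or only in `new` (inserted).
    # A 'replace' opcode contributes a span to both lists.
    deleted = [(i1, i2) for tag, i1, i2, j1, j2 in opcodes
               if tag in ("delete", "replace")]
    inserted = [(j1, j2) for tag, i1, i2, j1, j2 in opcodes
                if tag in ("insert", "replace")]
    moves = []
    for i1, i2 in deleted:
        for j1, j2 in inserted:
            sm = SequenceMatcher(None, old[i1:i2], new[j1:j2], autojunk=False)
            for m in sm.get_matching_blocks():
                # The final matching block is a zero-length sentinel,
                # which the size check filters out.
                if m.size >= min_len:
                    moves.append((i1 + m.a, j1 + m.b, m.size))
    return opcodes, moves


# Three blocks of distinct characters; the last one is moved forward.
old = "0123456789" + "abcdefghijklmnop" + "QRSTUVW"
new = "0123456789" + "QRSTUVW" + "abcdefghijklmnop"
opcodes, moves = find_moves(old, new)
print(moves)  # [(26, 10, 7)]
```

Here the plain diff reports "QRSTUVW" as an insertion at position 10
of the new text and a deletion at position 26 of the old one; the
second pass pairs the two up. The pairwise loop is quadratic in the
number of changed spans, and a block that difflib happens to align
as "equal" elsewhere would not be found this way.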

Thanks in advance,
      Vlastimil Brom


