A suggested feature for difflib
Right now, any alternative SequenceMatcher implementation in difflib requires that the implementor also reimplement the diff formatting methods (unified_diff and context_diff).

I propose something along these lines (patch to follow if the idea is sound):

    class Differ:
        def __init__(self, linejunk=None, charjunk=None,
                     sequence_matcher_factory=SequenceMatcher):
            # and so on

        def compare(self, a, b):
            cruncher = self.sequence_matcher_factory(self.linejunk, a, b)
            # and so on

The same general idea would apply to unified_diff, ndiff, and context_diff.

Basically, there are two obvious points of extension in difflib: the formatting of the output opcodes, and the method used to find matching sequences. The latter is not easily extended while reusing the former, and I'd like to change that.

Note that I'm asking because I'd like to do the work, not because I want someone else to do it. Is this a reasonable idea?

--
Chris R.
======
Not to be taken literally, internally, or seriously.
Twitter: http://twitter.com/offby1
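To make the proposal above concrete, here is a minimal standalone sketch of the same idea. It is not a patch to difflib: simple_diff is a hypothetical helper, and the sequence_matcher_factory parameter is the one being proposed, not an existing difflib argument. The only assumption is that the factory accepts (isjunk, a, b) and returns an object with get_opcodes(), as SequenceMatcher does.

```python
from difflib import SequenceMatcher
from functools import partial


def simple_diff(a, b, linejunk=None, sequence_matcher_factory=SequenceMatcher):
    """Yield '- ', '+ ' and '  ' prefixed lines (hypothetical helper)."""
    # The formatter depends only on the opcodes, so any matcher that
    # provides get_opcodes() can be swapped in through the factory.
    cruncher = sequence_matcher_factory(linejunk, a, b)
    for tag, alo, ahi, blo, bhi in cruncher.get_opcodes():
        if tag == "equal":
            for line in a[alo:ahi]:
                yield "  " + line
        else:
            # 'delete' has an empty b range, 'insert' an empty a range,
            # and 'replace' has both, so one branch covers all three.
            for line in a[alo:ahi]:
                yield "- " + line
            for line in b[blo:bhi]:
                yield "+ " + line


old = ["one", "two", "three"]
new = ["one", "2", "three"]

# Default behaviour: plain SequenceMatcher.
default_result = list(simple_diff(old, new))

# An "alternative implementation" can be as small as a pre-configured
# SequenceMatcher; autojunk is a real SequenceMatcher keyword argument.
no_autojunk = partial(SequenceMatcher, autojunk=False)
custom_result = list(simple_diff(old, new, sequence_matcher_factory=no_autojunk))
```

The formatting loop never names SequenceMatcher directly, which is the reuse the proposal is after.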
The general idea sounds reasonable to me, but a more appropriate signature addition may be "line_matcher=SequenceMatcher, char_matcher=SequenceMatcher". As the Differ docs point out: "Differ uses SequenceMatcher both to compare sequences of lines, and to compare sequences of characters within similar (near-matching) lines."

Cheers,
Nick.
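A rough sketch of what the two-factory signature might look like follows. TwoLevelDiffer is a hypothetical class written for illustration, not difflib.Differ, and the "? similarity" line is an invented stand-in for Differ's real intraline markers. It only shows that line-level and character-level matching can be injected separately.

```python
from difflib import SequenceMatcher


class TwoLevelDiffer:
    """Hypothetical sketch: one matcher factory per comparison level."""

    def __init__(self, linejunk=None, charjunk=None,
                 line_matcher=SequenceMatcher, char_matcher=SequenceMatcher):
        self.linejunk = linejunk
        self.charjunk = charjunk
        self.line_matcher = line_matcher
        self.char_matcher = char_matcher

    def compare(self, a, b):
        # Line level: match whole lines of a against whole lines of b.
        lines = self.line_matcher(self.linejunk, a, b)
        for tag, alo, ahi, blo, bhi in lines.get_opcodes():
            if tag == "equal":
                for line in a[alo:ahi]:
                    yield "  " + line
                continue
            for line in a[alo:ahi]:
                yield "- " + line
            for line in b[blo:bhi]:
                yield "+ " + line
            # Character level: only for one-to-one line replacements.
            if tag == "replace" and ahi - alo == 1 and bhi - blo == 1:
                chars = self.char_matcher(self.charjunk, a[alo], b[blo])
                yield "? similarity %.2f" % chars.ratio()


# Record which pairs of lines reach the character-level matcher.
calls = []


def recording_char_matcher(isjunk, a, b):
    calls.append((a, b))
    return SequenceMatcher(isjunk, a, b)


differ = TwoLevelDiffer(char_matcher=recording_char_matcher)
result = list(differ.compare(["spam", "eggs"], ["spam", "egg"]))
```

Here only the character-level factory is replaced; the line-level comparison still uses the default SequenceMatcher.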
Participants (2): Chris Rose, Nick Coghlan