July 7, 2010
1:47 a.m.
On Tue, Jul 6, 2010 at 7:18 PM, Terry Reedy <tjreedy@udel.edu> wrote:
[Also posted to http://bugs.python.org/issue2986]
A much faster way to find the first mismatch would be
    i = 0
    while first[i] == second[i]:
        i+=1
The match ratio, based on the initial matching prefix only, is spuriously low.
I don't have much experience with the Python sequence matcher, but many classical edit distance and alignment algorithms benefit from stripping any common prefix and suffix before doing the heavy lifting. Doing so is trivially optimal for Hamming-like distances, and it is easily shown to be optimal for Levenshtein and Damerau-type distances as well.
-Kevin
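To make the prefix/suffix idea concrete, here is a rough sketch in plain Python. It is not taken from difflib or from the patch on the tracker; the names strip_common_affixes, levenshtein, and levenshtein_stripped are made up for illustration, and the distance is a textbook dynamic-programming Levenshtein. Note that the prefix scan is bounded by the shorter length, so it does not run off the end when one sequence is a prefix of the other.

```python
def strip_common_affixes(a, b):
    """Return (a_mid, b_mid, prefix_len, suffix_len).

    Hypothetical helper: trims the longest common prefix and then the
    longest common suffix of what remains.
    """
    limit = min(len(a), len(b))

    # Scan forward while the elements agree, staying inside both sequences.
    start = 0
    while start < limit and a[start] == b[start]:
        start += 1

    # Scan backward, never overlapping the prefix already consumed.
    end = 0
    while end < limit - start and a[len(a) - 1 - end] == b[len(b) - 1 - end]:
        end += 1

    return a[start:len(a) - end], b[start:len(b) - end], start, end


def levenshtein(a, b):
    """Textbook O(len(a) * len(b)) Levenshtein distance, one row at a time."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (x != y)))  # substitution or match
        prev = cur
    return prev[-1]


def levenshtein_stripped(a, b):
    """Same distance, computed only on the differing middle sections."""
    a_mid, b_mid, _, _ = strip_common_affixes(a, b)
    return levenshtein(a_mid, b_mid)
```

For two long inputs that differ only in a short middle section, the quadratic part then runs on that middle section alone, while the linear scans handle the rest.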