[Tutor] longest common substring

Jerry Hill malaclypse2 at gmail.com
Fri Nov 11 16:53:32 CET 2011


There's nothing wrong with writing your own code to find the longest common
substring, but are you aware that python has a module in the standard
library that already does this?  In the difflib module, the SequenceMatcher
class can compare two sequences and extract the longest common sequence of
elements from it, like this:

Code:
import difflib

a = [1, 2, 3, 7]
b = [2, 3, 7]

seq_matcher = difflib.SequenceMatcher(None, a, b)
print seq_matcher.find_longest_match(0, len(a), 0, len(b))

Outputs:
Match(a=1, b=0, size=3)

See http://docs.python.org/library/difflib.html#sequencematcher-objects for
lots of details.  The SequenceMatcher class can do a lot more than finding
the common substrings, as you might expect.  The Module of the Week article
for difflib may also be of interest:
http://www.doughellmann.com/PyMOTW/difflib/index.html

-- 
Jerry
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/tutor/attachments/20111111/70824634/attachment.html>


More information about the Tutor mailing list