How fuzzy is get_close_matches() in difflib?

Steven D'Aprano steve at REMOVEME.cybersource.com.au
Fri Nov 17 00:09:23 EST 2006


On Thu, 16 Nov 2006 20:19:50 -0800, John Henry wrote:

> I did try them and I am impressed.  It helped me find a lot of useful
> info.   I just want to get a feel for what constitutes a "match".

The source code has lots of comments, but they don't explain the basic
algorithm (at least not in the difflib.py supplied with Python 2.3).

There is no single diff algorithm, but I believe that the basic idea is to
look for the insertions and/or deletions needed to turn one string into
the other. If you want more detail, google "diff". Once you have a list of
differences, the closest match is the candidate string with the fewest
differences from the search string.
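
To put a number on "fewest differences": difflib exposes a similarity
score through SequenceMatcher.ratio(), documented as 2.0*M/T, where M is
the number of matching characters and T is the combined length of both
strings. A minimal sketch, assuming a current Python 3 difflib rather than
the 2.3 one; the helper name similarity() is mine, not part of difflib.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Return difflib's similarity score for two strings (0.0 to 1.0).

    similarity() is a hypothetical helper for illustration only; the
    real work is done by SequenceMatcher.ratio().
    """
    # The first argument is the "junk" filter; None means nothing is junk.
    return SequenceMatcher(None, a, b).ratio()


if __name__ == "__main__":
    # Identical strings score 1.0; strings sharing no characters score 0.0.
    for other in ("apple", "appel", "zzz"):
        print("apple", other, similarity("apple", other))
```

A higher score means fewer differences, so ranking candidates by this
score is one way to read "closest match".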

As for getting a feel for what constitutes a match, I really can't make any
better suggestion than to try lots of examples in the interactive Python
shell.
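
Something along these lines is one way to experiment. It is only a sketch,
assuming a current Python 3 difflib: get_close_matches() takes n (maximum
number of results, default 3) and cutoff (minimum similarity, default 0.6),
so varying cutoff shows how fuzzy a "match" is allowed to be. The helper
name explore() and the word list are made up for illustration.

```python
from difflib import get_close_matches


def explore(word, candidates, cutoffs=(0.3, 0.6, 0.9)):
    """Map each cutoff to the matches get_close_matches() returns for it.

    explore() is a hypothetical helper, not part of difflib.
    """
    results = {}
    for cutoff in cutoffs:
        # Candidates scoring below `cutoff` are dropped; at most n survive,
        # ordered from most to least similar.
        results[cutoff] = get_close_matches(word, candidates, n=3, cutoff=cutoff)
    return results


if __name__ == "__main__":
    words = ["ape", "apple", "peach", "puppy"]  # illustrative candidates
    for cutoff, matches in explore("appel", words).items():
        print(cutoff, matches)
```

Lowering cutoff lets in looser matches, and raising it towards 1.0 leaves
only near-identical strings.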



-- 
Steven D'Aprano 



