locating strings approximately

BBands bbands at gmail.com
Thu Jun 29 01:28:54 CEST 2006


I'd like to see if a string exists, even approximately, in another. For
example if "black" exists in "blakbird" or if "beatles" exists in
"beatlemania". The application is to look through a long list of songs
and return any approximate matches along with a confidence factor. I
have looked at edit distance, but that isn't a good choice for finding
a short string in a longer one. I have also explored
difflib.SequenceMatcher and .get_close_matches, but what I'd really
like is something like:

a = FindApprox("beatles", "beatlemania")
print a
0.857
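
To make the idea concrete, here is a minimal sketch of one possible
approach, not a definitive answer: slide a window the length of the short
string along the longer one and keep the best difflib.SequenceMatcher
ratio. The name find_approx and the fixed window width are assumptions of
this sketch, and it makes no attempt to be fast on long lists.

```python
from difflib import SequenceMatcher


def find_approx(needle, haystack):
    """Return the best similarity (0.0 to 1.0) between needle and any
    needle-length window of haystack.

    Hypothetical helper: the fixed window width is an assumption, so
    matches that need an inserted or dropped character score lower than
    they might with a variable-width window.
    """
    if not needle or not haystack:
        return 0.0

    # If the haystack is shorter than the needle, compare against all of it.
    width = min(len(needle), len(haystack))

    best = 0.0
    for start in range(len(haystack) - width + 1):
        window = haystack[start:start + width]
        # ratio() is 2 * matches / (len(needle) + len(window)).
        score = SequenceMatcher(None, needle, window).ratio()
        if score > best:
            best = score
    return best
```

With this sketch, find_approx("beatles", "beatlemania") picks the window
"beatlem", which shares six of seven characters, so the result rounds to
0.857. An exact substring scores 1.0.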

Any ideas?

    jab



