Percentage matching of text

Tim Churches tchur at
Fri Jul 30 17:20:06 CEST 2004

On Fri, 2004-07-30 at 23:52, Bruce Eckel wrote:
> What I'd like to do is find an algorithm that produces the results of
> a text comparison as a percentage-match. Thus I would be able to
> assert that my test samples must match the control sample by at least
> (for example) 83% for the test to pass. Clearly, this wouldn't be a
> perfect test but it would help flag problems, which is primarily what
> I need.
> Does anyone know of an algorithm or library that would do this? Thanks
> in advance.

Python implementations of a range of such algorithms can be found in
Febrl - see section 9.2 of the manual:

I suspect that a simple bigram comparison would meet your needs best.
Alternatively, just use the difflib module in the Python standard
library, which implements the Ratcliff-Obershelp comparator.
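
To make the two suggestions concrete, here is a rough sketch of both as
percentage scores. The function names are made up for illustration, and
the bigram version assumes a Dice coefficient over character bigrams,
which is one common definition and not necessarily what Febrl uses.

```python
from collections import Counter
from difflib import SequenceMatcher


def difflib_percent(a, b):
    """Similarity of two strings as a percentage, using difflib's ratio()."""
    return 100.0 * SequenceMatcher(None, a, b).ratio()


def bigrams(text):
    """Return the overlapping two-character substrings of text."""
    return [text[i:i + 2] for i in range(len(text) - 1)]


def bigram_percent(a, b):
    """Dice coefficient over character bigrams, as a percentage.

    Assumption: bigrams are counted as a multiset, so repeated
    bigrams only match as many times as they occur in both strings.
    """
    count_a, count_b = Counter(bigrams(a)), Counter(bigrams(b))
    total = sum(count_a.values()) + sum(count_b.values())
    if total == 0:
        # Both strings are shorter than two characters.
        return 100.0 if a == b else 0.0
    shared = sum((count_a & count_b).values())
    return 100.0 * 2 * shared / total
```

A test could then assert something like
difflib_percent(test_output, control) >= 83. For long inputs it may be
worth passing autojunk=False to SequenceMatcher, since its default
heuristic treats very frequent elements as junk and can lower the score.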

Tim C


More information about the Python-list mailing list