Comparing two book chapters (text files)
Tino Wildenhain
tino at wildenhain.de
Thu Feb 5 05:21:44 EST 2009
andrew cooke wrote:
> On Feb 4, 10:20 pm, Nick Matzke <mat... at berkeley.edu> wrote:
>> So I have an interesting challenge. I want to compare two book
>> chapters, which I have in plain text format, and find out (a) percentage
>> similarity and (b) what has changed.
>
> no idea if it will help, but i found this yesterday - http://www.nltk.org/
>
> it's a python toolkit for natural language processing. there's a book
> at http://www.nltk.org/book with much more info.
Also, there is difflib in the standard library, which can be used
depending on your exact definition of "similarity".
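
A rough sketch of how that might look, assuming "similarity" means the
ratio reported by difflib.SequenceMatcher over the words of each chapter,
and "what has changed" means a line-based unified diff. The function names
similarity() and changes() and the sample strings are made up for
illustration; real chapters would be read from the two text files instead.

```python
import difflib


def similarity(old_text, new_text):
    """Return a similarity ratio between 0.0 and 1.0.

    Assumption: the texts are compared word by word. Comparing by
    character or by line would give different numbers.
    """
    old_words = old_text.split()
    new_words = new_text.split()
    matcher = difflib.SequenceMatcher(None, old_words, new_words)
    return matcher.ratio()


def changes(old_text, new_text):
    """Return the unified diff between the two texts as a list of lines."""
    old_lines = old_text.splitlines()
    new_lines = new_text.splitlines()
    return list(
        difflib.unified_diff(
            old_lines, new_lines, fromfile="old", tofile="new", lineterm=""
        )
    )


if __name__ == "__main__":
    # Hypothetical sample data standing in for the two chapters.
    old = "The quick brown fox\njumps over the lazy dog\n"
    new = "The quick red fox\njumps over the lazy dog\n"

    print("similarity: %.1f%%" % (100 * similarity(old, new)))
    for line in changes(old, new):
        print(line)
```

Multiplying the ratio by 100 gives a percentage, but whether that number
matches what you mean by "percentage similarity" depends on the unit you
compare (words here) and on how much reordering the chapters contain.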
Regards
Tino