Testing for changes on a web page (was: how to find difference in number of characters)
Emmanuel Surleau
emmanuel.surleau at gmail.com
Sat Oct 9 09:21:11 EDT 2010
> On Oct 9, 5:41 pm, Stefan Behnel <stefan... at behnel.de> wrote:
> > "Number of characters" sounds like a rather useless measure here.
>
> What I meant by number of characters was the number of edits that happened
> between the two versions. Levenshtein distance may be one way to do
> this, but I was wondering if difflib could do it.
> regards
As pointed out above, you also need to consider how the structure of the web
page has changed. If you are only looking at plain text, the Levenshtein
distance measures the minimum number of single-character edit operations
(insertion, deletion or substitution) necessary to transform string A into
string B.
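
For illustration, here is a minimal sketch of both approaches on plain text,
assuming the two page versions are already available as strings. The function
names levenshtein and difflib_edit_count are made up for this example. The
first is the standard dynamic-programming edit distance; the second counts the
characters touched by the non-equal opcodes of difflib.SequenceMatcher, which
gives a valid edit script but not necessarily the shortest one, so treat it as
an upper bound rather than the true Levenshtein distance.

```python
import difflib


def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions or
    substitutions needed to turn string a into string b."""
    # previous[j] holds the distance between the prefix of a processed
    # so far and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute, or keep on a match
            ))
        previous = current
    return previous[-1]


def difflib_edit_count(a, b):
    """Hypothetical helper: number of characters affected by difflib's
    non-equal opcodes. An approximation, not guaranteed to be minimal."""
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    total = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # A 'replace' spanning unequal lengths is counted as
            # substitutions plus the leftover insertions or deletions.
            total += max(i2 - i1, j2 - j1)
    return total


if __name__ == "__main__":
    old, new = "kitten", "sitting"
    print(levenshtein(old, new))
    print(difflib_edit_count(old, new))
```

The pure-Python Levenshtein loop takes time proportional to len(a) * len(b),
so for whole pages it may be more practical to compare line by line or to
work on the extracted text rather than the raw HTML.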
Cheers,
Emm