Diez B. Roggisch
deets at nospam.web.de
Mon Jan 30 17:30:06 CET 2006
Fredrik Lundh wrote:
> Diez B. Roggisch wrote:
>> The advantage becomes apparent when you try to e.g. compare
>> "Angelina Jolie"
>> Both have a l-dist of 3
>>>> distance("Angelina Jolie", "AngelinaJolei")
>>>> distance("Angelina Jolie", "Bob")
> what did I miss ?
Hmm. I missed something - the "1" before the "3" in 13 when I looked at my
terminal after running the example. And according to
it has the property
"""It is always at least the difference of the sizes of the two strings."""
And I got my implementation from there (or rather from Magnus Lie Hetland,
whose Python version is referenced there).
So you are right, my example is crap.
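
For reference, the lower-bound property quoted above can be checked with a
small sketch. This is a plain two-row dynamic-programming Levenshtein
distance written for illustration; it is not the implementation referenced
above, and the name distance() is only chosen to match the quoted calls.

```python
def distance(a, b):
    """Levenshtein distance between a and b (two-row dynamic programming)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in b so the rows stay small
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


print(distance("Angelina Jolie", "AngelinaJolei"))  # 3
print(distance("Angelina Jolie", "Bob"))            # 13

# The distance is never smaller than the difference in length (here 14 - 3 = 11).
assert distance("Angelina Jolie", "Bob") >= len("Angelina Jolie") - len("Bob")
```

So the two pairs differ by 3 and 13, not by 3 and 3.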
But I ran into cases where my normalizing made sense - otherwise I wouldn't
have done it :)
I guess it is more along the lines of (coughed up example)
I can only say that I used it to fuzzy-compare people's and hotel names, and
applying the normalization made my results far better.
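
A rough sketch of what such a normalization could look like follows. The
formula is an assumption: this post does not spell it out, so the sketch
simply divides the distance by the length of the longer string. The function
name normalized_distance() and the sample names are made up for illustration.

```python
def distance(a, b):
    """Levenshtein distance (two-row dynamic programming)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(previous[j] + 1,                # deletion
                               current[j - 1] + 1,             # insertion
                               previous[j - 1] + (ca != cb)))  # substitution
        previous = current
    return previous[-1]


def normalized_distance(a, b):
    """Assumed normalization: distance divided by the longer string's length.

    Returns a value between 0.0 (identical) and 1.0 (nothing in common).
    """
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0  # two empty strings are identical
    return distance(a, b) / longest


# Both pairs have a raw distance of 1, but one typo matters far more
# in a three-letter name than in an eighteen-letter one.
print(normalized_distance("Bob", "Rob"))
print(normalized_distance("Grand Hotel Europa", "Grand Hotel Europe"))
```

With a raw distance both pairs look equally close; scaled by length, the
long hotel names come out as a much better match than the short first names.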
Sorry to cause confusion.