Fuzzy Lookups

BBands bbands at gmail.com
Mon Jan 30 18:56:40 CET 2006


Diez B. Roggisch wrote:
> I did a levenshtein-fuzzy-search myself, however I enhanced my version by
> normalizing the distance the following way:

Thanks for the snippet. I agree that normalizing is important. A
distance of three is one thing when your strings are long, but quite
another when they are short. I'd been thinking about something along
these lines myself, but hadn't gotten there yet. It'll be interesting
to have a look at the distribution of the normalized numbers; I'd guess
that there may be a rough threshold that effectively separates the
wheat from the chaff.
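
To make the point about normalizing concrete, here is a minimal sketch. The quoted snippet is not reproduced in this message, so this is not necessarily the normalization it used: dividing by the length of the longer string is an assumption, and the names levenshtein, normalized_distance and fuzzy_matches, as well as the threshold parameter, are made up for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Plain dynamic-programming edit distance (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a  # iterate over the longer string, keep rows short
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def normalized_distance(a: str, b: str) -> float:
    """Edit distance scaled to [0, 1] by the longer string's length.

    Assumption: this is one common normalization, not necessarily
    the one from the quoted snippet.
    """
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0  # two empty strings are identical
    return levenshtein(a, b) / longest


def fuzzy_matches(query: str, candidates: list[str], threshold: float) -> list[str]:
    """Keep candidates whose normalized distance is at or below threshold.

    The threshold is hypothetical; a sensible value would have to be
    read off the distribution of normalized distances in real data.
    """
    return [c for c in candidates if normalized_distance(query, c) <= threshold]
```

With this scaling, "cat" against "dog" has a raw distance of three and a normalized distance of 1.0, while the same raw distance between "kitten" and "sitting" normalizes to 3/7, which is the short-versus-long difference described above.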

    jab



