fuzzysearch: find not exactly what you're looking for!
Tal Einat
taleinat at gmail.com
Thu Feb 12 14:51:17 EST 2015
Hi everyone,
I'd like to introduce a Python library I've been working on for a
while: fuzzysearch. I would love to get as much feedback as possible:
comments, suggestions, bugs and more are all very welcome!
fuzzysearch is useful when you'd like to find nearly-exact matches of
a sub-string within a longer string. What counts as a "nearly matching"
sub-string is defined by a maximum allowed Levenshtein distance[1].
This can be further refined by setting separate limits on the number
of substitutions, insertions and/or deletions.
Here is a basic example:
>>> from fuzzysearch import find_near_matches
>>> find_near_matches('PATTERN', 'aaaPATERNaaa', max_l_dist=1)
[Match(start=3, end=9, dist=1)]
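For anyone unfamiliar with Levenshtein distance, here is a minimal
pure-Python sketch of how the distance of 1 in the example above comes
about. It is only an illustration of the definition, not how
fuzzysearch works internally, and the levenshtein() helper is a
made-up name that is not part of the library:

```python
def levenshtein(a, b):
    """Return the Levenshtein (edit) distance between sequences a and b."""
    # At the start of each outer iteration, previous[j] holds the
    # distance between a[:i-1] and b[:j].
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca from a
                current[j - 1] + 1,            # insert cb into a
                previous[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        previous = current
    return previous[-1]


text = 'aaaPATERNaaa'
# text[3:9] is 'PATERN': one deletion away from 'PATTERN'.
print(levenshtein('PATTERN', text[3:9]))
```

The matched slice differs from the pattern by a single deleted 'T',
which is why it is reported with dist=1.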
The library supports Python 2.6+ and 3.2+ with a single code base. It
is extensively tested, with 97% code coverage. There are many
optimizations under the hood, including custom algorithms and
extensions written in C and Cython.
Install as usual:
$ pip install fuzzysearch
The repo is on GitHub:
https://github.com/taleinat/fuzzysearch
Let me know what you think!
- Tal Einat
.. [1]: http://en.wikipedia.org/wiki/Levenshtein_distance