Regular expression as dictionary key?
LLoeffler at home.com
Mon Dec 3 19:33:27 EST 2001
> I wonder if there's a way of precomputing the keys so that doing fuzzy
> searching on them won't be so expensive. The difflib module allows
> fuzzy searching, but I can imagine that it might be expensive to use.
I'd have to think about it some more. Thanks for bringing up
difflib--being new to python I didn't know it existed. That looks very
cool but as you pointed out potentially expensive with quadratic (avg)
performance. I'm basically screwing around with profiling spam by word
histograms. In the best case, the word is already in the system and it
returns with hash table speed. If not--then this fuzzy matching kicks
in (difflib looks great for that) to try to get a match for mispellings
or plural forms (more likely). As the db gets bigger and more
complete, there should be fewer misses. So actually, speed may not be a
great problem. Anyway, it's silly for me to be thinking about speed too
much right now with a lot left unfinished.
More information about the Python-list