Regular expression as dictionary key?

Luke LLoeffler at home.com
Mon Dec 3 19:33:27 EST 2001


> I wonder if there's a way of precomputing the keys so that doing fuzzy
> searching on them won't be so expensive.  The difflib module allows
> fuzzy searching, but I can imagine that it might be expensive to use.


I'd have to think about it some more.  Thanks for bringing up 
difflib--being new to python I didn't know it existed.  That looks very 
cool but as you pointed out potentially expensive with quadratic (avg) 
performance.  I'm basically screwing around with profiling spam by word 
histograms.  In the best case, the word is already in the system and it 
returns with hash table speed.  If not--then this fuzzy matching kicks 
in (difflib looks great for that) to try to get a match for mispellings 
  or plural forms (more likely).  As the db gets bigger and more 
complete, there should be fewer misses.  So actually, speed may not be a 
great problem.  Anyway, it's silly for me to be thinking about speed too 
much right now with a lot left unfinished.

Luke




More information about the Python-list mailing list