don't need dictionary's keys - hash table?

Paul Rubin http
Thu Jul 13 08:12:04 EDT 2006


Paul Rubin <http://phr.cx@NOSPAM.invalid> writes:
> Finally, I think to use Bloom filters effectively you have to choose
> nlayers and layersize carefully, according to the number of keys you
> expect to see, the amount of memory you have available, and/or the
> probability of false matches that you're willing to accept.  The
> standard references (whatever they are) about Bloom filters probably
> have some suitable formulas for computing these things.

Heh, there's an online calculator:

http://www.cc.gatech.edu/fac/Pete.Manolios/bloom-filters/calculator.html

Link is from 

http://en.wikipedia.org/wiki/Bloom_filter

which is a good article.  As described in the article, there's only
one "layer" in the Bloom filter (you use n different hashes but they
all touch the same bit map) which corresponds to how the ancient Unix
spell checker worked, I think.  It would take some analysis to show
which way is better.  I may try working out the formulas as an
exercise but I'm too sleepy for that right now.



More information about the Python-list mailing list