Is there anyone familiar with pybloom (bloom filter in python)?

Xell Zhang xellzhang at
Sun Jul 8 20:18:36 CEST 2007


I found the pybloom module and tried to use it in my crawler :)
I want to use it to store the URLs that have already been crawled. But
whenever I insert a URL string, I get a warning and a wrong result...
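
For comparison, the intended use (remembering crawled URLs) can be sketched
without pybloom. This is only a minimal illustration, assuming Python 3 and
SHA-256 from hashlib as the hash; the class name SimpleBloom and its methods
are hypothetical and are not the pybloom API.

```python
import hashlib


class SimpleBloom(object):
    """Hypothetical minimal Bloom filter (not the pybloom API)."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        # One bit per slot, packed eight to a byte.
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting the item with an index.
        data = item.encode("utf-8")
        for i in range(self.num_hashes):
            salted = str(i).encode("ascii") + b":" + data
            digest = hashlib.sha256(salted).hexdigest()
            yield int(digest, 16) % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # True only if every derived bit is set; false positives are possible,
        # false negatives are not.
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


seen = SimpleBloom(800000, 4)
print("BBB" in seen)                   # False: nothing has been added yet
seen.add("http://example.com/a")
print("http://example.com/a" in seen)  # True: added items are always found
```

An empty filter has no bits set, so a membership test on it should never
report a hit.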

My testing code is quite simple:
from pybloom import CountedBloom
cb = CountedBloom(800000, 4)
print cb.__contains__("BBB")

E:\EclipseWorkspace\demo\src\ DeprecationWarning: 'I' format
requires 0 <= number <= 4294967295
  b = [ord(x) for x in struct.pack ('I', val)]

I get this warning when I run the code above.
The output is "1", which means "BBB" is reported as being in the set, even
though it was never added.
When I test with integers instead, the result seems right.
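
As the warning says, the 'I' format of struct.pack only accepts values from 0
to 4294967295. One possible cause (an assumption, since the pybloom source is
not shown here) is that val is derived from hashing the string and ends up
negative or wider than 32 bits. A minimal sketch of reducing a value to 32
bits before packing, assuming Python 3; the helper name uint32_bytes is
hypothetical, and '<I' (little-endian) is used here only to make the byte
order predictable:

```python
import struct

UINT32_MAX = 0xFFFFFFFF  # largest value the 'I' format accepts


def uint32_bytes(val):
    """Return the four byte values of val after reducing it to 32 bits."""
    masked = val & UINT32_MAX  # also maps negative ints into 0..UINT32_MAX
    return list(bytearray(struct.pack('<I', masked)))


print(uint32_bytes(66))           # [66, 0, 0, 0]
print(uint32_bytes(-1))           # [255, 255, 255, 255]
print(uint32_bytes(2 ** 32 + 5))  # [5, 0, 0, 0]
```

Whether masking is the right fix depends on how the module computes val, so
this only illustrates the range constraint.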

I am not familiar with the arithmetic involved, and I don't know if I wrote
something wrong.
Can anyone help me? Thanks!

Zhang Xiao

Junior engineer, Web development

Ethos Tech.