creating bitmasks (for bloom filters)

Aaron Straup Cope asc at
Sat Jun 26 04:56:03 CEST 2004


Included below is a copy of a message I sent to the py-tutor list. It
didn't garner much in the way of a solution and it was suggested that
this list might be helpful.


-----Forwarded Message-----
> From: Aaron Straup Cope <asc at>
> To: pytutor <tutor at>
> Subject: creating bitmasks (for bloom filters)
> Date: Fri, 11 Jun 2004 08:49:20 -0400
> Hi,
> A quick introduction : I am mostly a Perl hacker, but I am teaching
> myself Python because it seems like the right tool for the job in
> question. [1] I am not, by any stretch, much of a math weenie.
> I am trying to port the Bloom::Filter Perl module [2] for use with my
> tool. I know that there is a "pybloom" sourceforge project [3] but
> licensing issues aside it's reliance on Unix math libs makes it
> problematic for use with a cross-platform GUI tool.
> At this point, my not very deep grasp of Python and my poor
> understanding of some basic compsci/math principles have combined to
> frustrate my efforts.
> I was hoping that the the Perl/Python operator and function set would be
> similar enough that, with only a few modifications, I could clone the
> package pretty much "as is". All other bugs aside, I am stuck
> trying to create a bitmask of a key :
> "Given a key, hashes it using the list of salts and returns a bitmask
> the same length as the Bloom filter.  Note that Perl will pad the 
> bitmask out with zeroes so it's a muliple of 8."
>    # from the docs for the _make_bitmask method
> Specifically, it's not clear to me :
>  * how to map the various Perl pack/unpack formats to their Python
>    equivalents (things like Perl's "b*" which is "a bit string, in 
>    ascending bit order inside each byte" where "*" means to use all
>    the bytes from the input field")
>  * how to account for the fact that unpack returns a list in Perl and
>    Python returns a string
>  * how to pass a salt to the Python sha module
>  * how to XOR the pieces of an unpacked sha digest (which may simply
>    be a function of a bad format unpacking stuff)
>  * A 'vec' [4] function in Python; does one exist?
> In short, I don't really have any idea what I'm doing :-(
> As much as I relish the challenge purely on the grounds of learning new
> stuff I am starting to reach the point of irrational frustration and
> wanting to just get this done so I can move on to other things. 
> The tool should be written in Python because it's the best choice for
> the job. The tool needs to be able to read/write Bloom Filters. Python
> needs a Bloom Filter library, so I will step up and write it. Except it
> appears to be beyond my skill level right now...
> Any thoughts, suggestions or clue-bats would be very much welcomed.
> Pointers to an already existing implementation that I've missed would be
> even better but I'll take what I can get. :-)
> Thanks,
> ---
> [1]
> [2]
> [3]
> [4]

More information about the Python-list mailing list