[Spambayes] date for new release to handle image spam?

skip at pobox.com
Fri Feb 2 16:27:06 CET 2007


    Seth> Rather than try to imagine which clues will be definitive, I was
    Seth> thinking out loud if we might provide a large number of seemingly
    Seth> unrelated clues and letting the Bayesian classifier look for
    Seth> correlations.

Yeah, but we do need to perform the clue extraction from the image or its
properties. ;-)

    Seth> Maybe things like animation rate, contrast ratio, color bias,
    Seth> ... any actual piece of information that varies from one image to
    Seth> the next.

If people can tell me how to compute any of these metrics using PIL (or
point me to some cookbook sites that describe them), I can put them into SB.
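
As a starting point, here is a rough sketch of how a few of these might be
computed. It is not SB code. It assumes the modern Pillow fork of PIL, and
the function names and the definitions of "contrast" (standard deviation of
gray levels) and "color bias" (each channel's mean minus the average of the
channel means) are my own guesses at what Seth has in mind.

```python
def contrast_from_histogram(hist):
    """RMS contrast: standard deviation of gray levels in a histogram."""
    total = sum(hist)
    if total == 0:
        return 0.0
    mean = sum(level * n for level, n in enumerate(hist)) / total
    var = sum(n * (level - mean) ** 2 for level, n in enumerate(hist)) / total
    return var ** 0.5


def color_bias(channel_means):
    """Each channel's mean minus the average over all channels."""
    avg = sum(channel_means) / len(channel_means)
    return [m - avg for m in channel_means]


def image_clues(path):
    """Collect a few raw metrics from an image file (hypothetical helper)."""
    # Pillow is imported here so the pure-Python helpers above work without it.
    from PIL import Image, ImageStat

    clues = {}
    with Image.open(path) as im:
        # Animated formats such as GIF expose n_frames; others have one frame.
        clues["frames"] = getattr(im, "n_frames", 1)
        # GIF frame delay in milliseconds, or None if the file has none.
        clues["frame_ms"] = im.info.get("duration")
        rgb = im.convert("RGB")  # converts the first frame only
        clues["contrast"] = contrast_from_histogram(rgb.convert("L").histogram())
        clues["bias"] = color_bias(ImageStat.Stat(rgb).mean)
    return clues
```

The raw numbers would still need to be bucketed into tokens before the
classifier could use them.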

    Seth> Then there are the email specific ones like content transfer
    Seth> encoding of each MIME part, total characters in each MIME part,
    Seth> character set, etc.

I think a fair amount of this stuff is already calculated.  We probably need
a dictionary of synthetic clues written down somewhere we can refer to.
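
For reference, a minimal sketch of what per-part synthetic clues could look
like, using only the standard email package. The token names ("cte:",
"charset:", "size-bits:") are made up for illustration and are not
necessarily what the SB tokenizer emits today.

```python
import email


def mime_clues(raw_message):
    """Yield illustrative synthetic tokens for each leaf MIME part."""
    msg = email.message_from_string(raw_message)
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers carry no payload of their own
        cte = part.get("Content-Transfer-Encoding", "none")
        yield "cte:" + cte.strip().lower()
        yield "charset:" + (part.get_content_charset() or "none")
        payload = part.get_payload()  # still transfer-encoded text
        size = len(payload) if isinstance(payload, str) else 0
        # Bucket by bit length so parts of similar size share one token.
        yield "size-bits:%d" % size.bit_length()
```

Bucketing the size keeps the number of distinct tokens small, so the
classifier sees each one often enough to learn something from it.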

Skip

