[Spambayes] mini-spams

voomp voomp at textrix.co.uk
Fri Sep 10 13:22:17 CEST 2004


Several weeks ago, Tony Meyer wrote:

>Yes, these 'mini-spams' or 'micro-spams' will be the toughest for SpamBayes
>to work with, because there isn't much information - for the most part, just
>the headers.

>We are looking into ways to better deal with these (although for many the
>headers and whatever body there is does provide enough clues).  The
>'use-bigrams' option might help somewhat (it considers pairs of words as
>well as individual words), as might some of the other options that are off
>by default.
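
To illustrate what "pairs of words as well as individual words" means, here is a minimal sketch. It is not SpamBayes' own tokenizer; the function name and the "bi:" prefix are only assumptions made for the illustration.

```python
def unigrams_and_bigrams(text):
    """Return the single words of `text` followed by adjacent word pairs.

    Illustrative only: the "bi:" prefix is a hypothetical marker used here
    to keep pair tokens distinct from single-word tokens.
    """
    words = text.split()
    tokens = list(words)
    # Pair each word with the word that follows it.
    tokens += [f"bi:{a} {b}" for a, b in zip(words, words[1:])]
    return tokens
```

A three-word message gives three single-word tokens plus two pair tokens, so a very short message offers the classifier noticeably more clues.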

>(Turning on these options in Outlook is a somewhat difficult process.  You
>have to open up the 'default_bayes_customize.ini' file in the SpamBayes data
>directory (or create it if there isn't one) and add the appropriate options.
>For example, you'd add

>[Classifier]
>x-use_bigrams:True

>for the bigrams option.

I followed his suggestion, and since then not a single spam has gone
undetected. The false-alarm rate is still near-negligible, even though the
suspect spam level is set to 2% and the certain spam level to 5%.
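
For anyone who would rather script the change than edit the file by hand, here is a minimal sketch using Python's standard configparser module. The function name is hypothetical, and the path to 'default_bayes_customize.ini' has to be supplied by the caller, since the location of the SpamBayes data directory varies. Only the section and option names come from Tony's message.

```python
import configparser


def enable_bigrams(ini_path):
    """Add 'x-use_bigrams:True' under [Classifier], creating the file if needed."""
    # ':' is listed first so the file is written with the same delimiter
    # as in the example above; interpolation is off so '%' is left alone.
    parser = configparser.ConfigParser(delimiters=(":", "="), interpolation=None)
    parser.optionxform = str  # keep option names exactly as written
    parser.read(ini_path)     # a missing file is silently skipped
    if not parser.has_section("Classifier"):
        parser.add_section("Classifier")
    parser.set("Classifier", "x-use_bigrams", "True")
    with open(ini_path, "w") as f:
        parser.write(f, space_around_delimiters=False)
```

Note that configparser drops any comments in an existing file when it rewrites it, so keep a backup if the file already holds hand-written notes.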

Best regards,

Doug Richardson











