[Spambayes] 15,000 spam and 200 known good

Anthony Baxter anthony at interlink.com.au
Thu Sep 25 07:13:59 EDT 2003

>>> mark_fitzgerald at keybank.com wrote
> I know what my results are, but I just wanted to check to see if this is 
> normal.. If I have a ton of spam, and only a few hundred good messages, it 
> seems that even when I send a known good message to the mailbox, it gets 
> classified as spam -- not maybe spam, but 100% spam.  Is there a way to 
> balance it out?   Should I have to?  I can send a message, do a move to 
> spam, (or move manually and retrain) and then resend the same exact 
> message, and it gets classified as SPAM again.  Any idea what's going on? 

This is a known issue - spambayes works best when trained on roughly equal
numbers of spam and ham. If you've trained on such a grossly mismatched
training set, your database is going to think almost everything is spam. Your
best bet is to delete your database and retrain on a much smaller set of spam.
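
If it helps, here is a rough sketch of one way to pick that smaller set: draw
a random sample of the spam that is about the same size as the ham. This is
only an illustration and does not use any spambayes API; the function name
balanced_subset, the ratio and seed parameters, and the placeholder message
ids are all made up for the example. You would feed the selected messages to
whatever training method you normally use.

```python
import random

def balanced_subset(spam_ids, ham_ids, ratio=1.0, seed=0):
    """Pick a random subset of spam sized at about ratio * len(ham_ids).

    Hypothetical helper, not part of spambayes. spam_ids and ham_ids can be
    any sequences that identify messages (file names, mailbox keys, ...).
    """
    # Never ask for more spam than is actually available.
    target = min(len(spam_ids), int(round(ratio * len(ham_ids))))
    rng = random.Random(seed)  # fixed seed so the selection is repeatable
    return rng.sample(list(spam_ids), target)

# Placeholder ids standing in for the 15,000 spam and 200 ham messages.
spam = ["spam-%05d" % i for i in range(15000)]
ham = ["ham-%03d" % i for i in range(200)]

subset = balanced_subset(spam, ham)
print(len(subset), len(ham))
```

Sampling at random, rather than just taking the first few hundred, should keep
the smaller spam set reasonably representative of the spam you actually get.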

Anthony Baxter     <anthony at interlink.com.au>   
It's never too late to have a happy childhood.
