[Spambayes] 15,000 spam and 200 known good
Anthony Baxter
anthony at interlink.com.au
Thu Sep 25 07:13:59 EDT 2003
>>> mark_fitzgerald at keybank.com wrote
> I know what my results are, but I just wanted to check to see if this is
> normal.. If I have a ton of spam, and only a few hundred good messages, it
> seems that even when I send a known good message to the mailbox, it gets
> classified as spam -- not maybe spam, but 100% spam. Is there a way to
> balance it out? Should I have to? I can send a message, do a move to
> spam, (or move manually and retrain) and then resend the same exact
> message, and it gets classified as SPAM again. Any idea what's going on?
This is a known issue - spambayes works best when trained on roughly equal
numbers of spam and ham. If you've trained on such a grossly mismatched
training set, your database is going to think almost everything is spam. Your
best bet is to delete your database and retrain on a much smaller set of spam.
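
If it helps, here is a rough sketch of one way to pick that smaller set of spam: take a random sample of the spam corpus about the same size as the ham corpus, then train on the sample and all of the ham. This is plain Python and not part of spambayes; the function name, the `ratio` parameter and the file paths are made up for illustration, and it assumes each message is stored as its own file.

```python
import random

def balanced_spam_sample(spam_paths, ham_paths, ratio=1.0, seed=None):
    """Return a random subset of spam_paths sized to roughly match the ham count.

    ratio is the desired spam:ham ratio (1.0 means equal numbers).
    Hypothetical helper for illustration, not a spambayes function.
    """
    rng = random.Random(seed)
    # Never ask for more spam than we actually have.
    target = min(len(spam_paths), int(len(ham_paths) * ratio))
    return rng.sample(list(spam_paths), target)

# Illustrative corpus sizes taken from the subject line; the paths are placeholders.
spam = [f"spam/{i}.txt" for i in range(15000)]
ham = [f"ham/{i}.txt" for i in range(200)]

subset = balanced_spam_sample(spam, ham, seed=0)
print(len(subset), "spam to train alongside", len(ham), "ham")
# prints: 200 spam to train alongside 200 ham
```

Sampling at random rather than taking the first 200 files avoids training only on the oldest spam, though training on the most recent spam instead would be another reasonable choice.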
Anthony
--
Anthony Baxter <anthony at interlink.com.au>
It's never too late to have a happy childhood.