[Spambayes] Re: What about Ham as Spam?

Kenny Pitt kennypitt at hotmail.com
Thu Mar 18 13:55:33 EST 2004


Skip Montanaro wrote:
>  Bethanie> This message Database only has 62 good and 9 spam - you should
>  Bethanie> consider performing additional training. ... is what my page
>  Bethanie> says when I check it. Is it possible that any of my HAM is
>  Bethanie> seen as Spam or vice verse and is there a way to access that
>  Bethanie> Database to check?

Since I don't have any context for this, I'm not certain what problem
you are actually having.

SpamBayes gives the warning about additional training until you have
trained on at least 10 messages of each type.  After that, it will warn
you about a training imbalance if you have more than 5 times as many
messages of one type as you do of the other.
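
To make the two thresholds concrete, here is a rough sketch of that
rule in Python.  This is not the actual SpamBayes code; the function
name, constant names, and return values are made up for illustration,
and only the numbers (10 and 5) come from the description above.

```python
# Hypothetical sketch of the warning rule described above.
# Not taken from the SpamBayes source; all names are assumptions.

MIN_TRAINED = 10   # minimum trained messages of each type
MAX_RATIO = 5      # largest tolerated ratio between the two counts


def training_warning(num_ham, num_spam):
    """Return which warning applies, or None if neither does."""
    # Too few trained messages of either type: "additional training".
    if num_ham < MIN_TRAINED or num_spam < MIN_TRAINED:
        return "additional training"
    # One type outnumbers the other by more than MAX_RATIO to 1.
    if num_ham > MAX_RATIO * num_spam or num_spam > MAX_RATIO * num_ham:
        return "imbalance"
    return None


print(training_warning(62, 9))
```

With 62 good and 9 spam, the first check is the one that fires, because
the spam count is still below 10.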

The counts you are referring to include only the messages that you have
told SpamBayes to train on.  They are not the total number of messages
that SpamBayes has classified for you.  While SpamBayes may make
mistakes in classifying messages, there shouldn't be any mistakes in the
training data unless you made a mistake in training.

With 62 good messages and only 9 spam messages trained, it's likely that
you won't get very good accuracy from SpamBayes.  Have you been training
only on messages that were classified as Unsure?  I would recommend
training on additional spam messages from the "classified as Spam"
section until you have roughly equal numbers of good and spam messages.
There is a good bit of discussion about training strategies on the
SpamBayes wiki at http://www.entrian.com/sbwiki/TrainingIdeas if you're
interested.

-- 
Kenny Pitt



