[spambayes-dev] Another piece of anecdotal evidence

T. Alexander Popiel popiel at wolfskeep.com
Wed Jan 14 13:07:06 EST 2004


In the last week or so, I've been noticing a higher rate of
false negatives in my mail.  Looking at the clues indicates
that I've got a spam or two mis-trained, but I haven't
bothered to find them yet (I'm currently in the middle of
restructuring my archives so that I don't have a single
directory with over 100,000 files in it).  On the other
hand, it appears that this mis-training is the only reason
I'm getting such a high rate of false negatives, despite
a spam:ham training ratio of roughly 50:1.

That's right.  50:1.  More specifically, for the last
four months, I have:

Total:    4694 ham, 39913 spam (89.48% spam)
Trained:   204 ham, 10994 spam (98.18% spam)
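
For anyone who wants to check the arithmetic, here is a minimal
sketch that recomputes the percentages and the spam:ham ratios
from the counts above.  The function names are made up for
illustration; nothing here is part of spambayes itself.

```python
def spam_share(ham: int, spam: int) -> float:
    """Return spam as a percentage of all messages (ham + spam)."""
    return 100.0 * spam / (ham + spam)


def spam_to_ham(ham: int, spam: int) -> float:
    """Return the number of spam messages per ham message."""
    return spam / ham


# Counts copied from the figures quoted in this message: (ham, spam).
counts = {
    "Total": (4694, 39913),
    "Trained": (204, 10994),
}

for label, (ham, spam) in counts.items():
    print(
        f"{label}: {spam_share(ham, spam):.2f}% spam, "
        f"{spam_to_ham(ham, spam):.1f}:1 spam:ham"
    )
```

The trained set works out to about 54:1, and the overall mail
stream to about 8.5:1.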

Having such a high imbalance does seem to make me particularly
susceptible to training errors... but doesn't seem to hurt
otherwise.

- Alex


