[spambayes-dev] Another piece of anecdotal evidence
T. Alexander Popiel
popiel at wolfskeep.com
Wed Jan 14 13:07:06 EST 2004
In the last week or so, I've been noticing a higher rate of
false negatives in my mail. Looking at the clues indicates
that I've got a spam or two mis-trained, but I haven't
bothered to find it yet (I'm currently in the middle of
restructuring my archives so that I don't have a single
directory with over 100,000 files in it). On the other
hand, it appears that this mis-training is the only reason
I'm getting such a high rate of false negatives, despite
a spam:ham training ratio of 50:1.
That's right. 50:1. More specifically, for the last
four months, I have:
  Total:   4694 ham, 39913 spam (89.48% spam)
  Trained:  204 ham, 10994 spam (98.18% spam)
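
The percentages and ratios can be rechecked from the raw counts with a few lines of Python. This is only a sketch for the arithmetic: the helper names spam_share and spam_to_ham are made up for illustration and are not part of SpamBayes. By this calculation the trained spam:ham ratio comes out nearer 54:1, so "50:1" is a round figure.

```python
def spam_share(ham: int, spam: int) -> float:
    """Return spam as a percentage of all messages (hypothetical helper)."""
    return 100.0 * spam / (ham + spam)


def spam_to_ham(ham: int, spam: int) -> float:
    """Return the spam:ham ratio as a single number (hypothetical helper)."""
    return spam / ham


# Counts copied from the two lines above.
total = {"ham": 4694, "spam": 39913}
trained = {"ham": 204, "spam": 10994}

for label, counts in (("Total", total), ("Trained", trained)):
    share = spam_share(**counts)
    ratio = spam_to_ham(**counts)
    print(f"{label}: {share:.2f}% spam, {ratio:.1f}:1 spam:ham")
```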
Having such a high imbalance does seem to make me particularly
susceptible to training errors... but doesn't seem to hurt
otherwise.
- Alex