[Spambayes] training issues
bill at parducci.net
Thu Mar 13 08:24:15 EST 2003
i receive a couple of newsletters and [travel] updates that i cannot get trained properly for the life of me. every time one of them comes in, it is classified as spam (scores in the high 90s are not uncommon) and dumped into my spam folder.
each time this happens, i move the message into my inbox and fire off mboxtrain. in looking at the message afterwards i see
in the header.
however, the next time a similar message arrives, it is dumped into spam again. my guess is that the weight of the content overcomes the effect of the training. state department travel warnings, for example, bear a tremendous degree of similarity to the scams from nigeria if you just look at the occurrences of 'low freq' words. i am also guessing that the training acts mainly by raising the header information to high ham probabilities, since much of the other information was previously trained as spam.
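to make that guess concrete, here is a rough sketch of chi-squared combining of per-token spam probabilities, which as far as i understand is the general scheme spambayes uses. this is not the project's actual code: the function names are my own, and the token counts and probabilities (5 header tokens at 0.01, 30 body tokens at 0.95) are made up purely for illustration.

```python
import math


def chi2q(x2, v):
    """Probability that a chi-squared variable with v (even) degrees
    of freedom is at least x2."""
    m = x2 / 2.0
    total = term = math.exp(-m)
    for i in range(1, v // 2):
        term *= m / i
        total += term
    return min(total, 1.0)


def chi2_score(probs):
    """Combine per-token spam probabilities into one score in [0, 1].
    Values near 1 mean spam, values near 0 mean ham."""
    n = len(probs)
    ln_ham = sum(math.log(1.0 - p) for p in probs)
    ln_spam = sum(math.log(p) for p in probs)
    s = 1.0 - chi2q(-2.0 * ln_ham, 2 * n)   # evidence for spam
    h = 1.0 - chi2q(-2.0 * ln_spam, 2 * n)  # evidence for ham
    return (s - h + 1.0) / 2.0


# hypothetical numbers: a handful of header tokens that training has
# pushed toward ham, and many body tokens previously seen in spam
header_tokens = [0.01] * 5
body_tokens = [0.95] * 30

print("headers only:  ", chi2_score(header_tokens))
print("headers + body:", chi2_score(header_tokens + body_tokens))
```

with these invented numbers the header tokens alone score as clear ham, but adding the spammy body tokens pushes the combined score well above 0.9, which would match what i am seeing.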
the bottom line is that i am not sure how to correct for this. suggestions?