[Spambayes] should we warn about grossly mismatched training sets?

Skip Montanaro skip at pobox.com
Sat Sep 27 16:01:14 EDT 2003

    Anthony> Just wondering - should the various user interfaces display a
    Anthony> warning if someone's training set gets seriously out of whack?
    Anthony> Say, more than a 4:1 ratio in the spam/ham ratio (in either
    Anthony> direction)?


If someone has a large number of stored ham or spam messages and injects
them into whatever SpamBayes app they are using, they may start off with an
imbalance which is next to impossible to overcome with incremental training.


