[Spambayes] Data file "out of balance"...?
tameyer at ihug.co.nz
Sun Jul 25 04:03:41 CEST 2004
> As a result, the ratio of spams to hams in my database
> quickly goes up over 2:1 which I understand from the FAQs is
> not the best way to have things set up.
I wouldn't worry about a 2:1 (or 1:2) imbalance. Anything up to around
5:1 is probably fine, and if you keep getting the results that you want,
then don't worry about the imbalance at all.
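
If you want to check the numbers yourself, here is a minimal sketch of the
arithmetic. It is not part of SpamBayes; the function names and the 5.0
limit are just my rule of thumb from above written as code, and you would
plug in the spam and ham counts that your own training statistics report.

```python
def imbalance_ratio(nspam, nham):
    """Return the larger count divided by the smaller one.

    A 2:1 and a 1:2 imbalance both come out as 2.0.
    """
    if nspam <= 0 or nham <= 0:
        raise ValueError("need at least one trained spam and one trained ham")
    return max(nspam, nham) / min(nspam, nham)


def is_probably_fine(nspam, nham, limit=5.0):
    """Rough check only: the limit is a rule of thumb, not a hard threshold."""
    return imbalance_ratio(nspam, nham) <= limit


if __name__ == "__main__":
    # Hypothetical counts: 200 spams and 100 hams trained.
    print(imbalance_ratio(200, 100), is_probably_fine(200, 100))
```

Even if the check says the ratio is past the limit, what matters is whether
the filter is still giving you the results you want.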
> When this happens, I have thought to train on more hams in
> the hope of getting the DB into better "balance" but I can't
> figure out how to train on hams only. I can't move hams from
> the spam folder because none are in there.
> What is the best way for me to handle the situation I
Put the hams you want to train on in an otherwise empty folder, then use
the Training tab of the SpamBayes Manager dialog to "Train Now" on that
folder.
Please always include the list (spambayes at python.org) in your replies
(reply-all), and please don't send me personal mail about SpamBayes. This
way, you get everyone's help, and avoid a lack of replies when I'm busy.