[Spambayes] Data file "out of balance"...?
Tony Meyer
tameyer at ihug.co.nz
Sun Jul 25 04:03:41 CEST 2004
> As a result, the ratio of spams to hams in my database
> quickly goes up over 2:1 which I understand from the FAQs is
> not the best way to have things set up.
I wouldn't worry about a 2:1 (or 1:2) imbalance. Anything up to around
5:1 is probably fine, and if you keep getting the results that you want,
then don't worry about the imbalance at all.
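
If you want to check the numbers yourself, the rule of thumb above can be
sketched in a few lines of Python. This is only an illustration: the function
names are made up for this example and are not part of the SpamBayes API, and
the default limit of 5 is just the rough figure mentioned above, not a hard
threshold. You would supply the spam and ham training counts yourself.

```python
def imbalance_ratio(nspam, nham):
    """Return the larger training count divided by the smaller one."""
    if nspam <= 0 or nham <= 0:
        raise ValueError("both counts must be positive")
    return max(nspam, nham) / min(nspam, nham)


def is_roughly_balanced(nspam, nham, limit=5.0):
    """True if the imbalance is within the (assumed) rule-of-thumb limit."""
    return imbalance_ratio(nspam, nham) <= limit
```

For example, 200 spams and 100 hams give a ratio of 2.0, which is well inside
the limit, while 600 spams and 100 hams give 6.0, which is not.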
> When this happens, I have thought to train on more hams in
> the hope of getting the DB into better "balance" but I can't
> figure out how to train on hams only. I can't move hams from
> the spam folder because none are in there.
>
> What is the best way for me to handle the situation I
> describe?
Put the ham messages you want to train on in an otherwise empty folder, and
use the SpamBayes Manager dialog's Training tab to "Train Now".
=Tony Meyer
---
Please always include the list (spambayes at python.org) in your replies
(reply-all), and please don't send me personal mail about SpamBayes. This
way, you get everyone's help, and avoid a lack of replies when I'm busy.