[Spambayes] Data file "out of balance"...?
sole at soleassociates.com
Sun Jul 25 13:21:03 CEST 2004
When I tried to add to the DB by training only on hams, SB balked. It simply
did not occur to me that I could give it a bunch of hams and one spam, but
that worked just fine.
I sincerely appreciate your assistance,
Sole & Associates, Inc.
Durham, New Hampshire 03824
Email: sole at soleAssociates.com
From: spambayes-bounces at python.org [mailto:spambayes-bounces at python.org] On
Behalf Of Tony Meyer
Sent: Saturday, July 24, 2004 10:04 PM
To: 'Kenneth Sole'; spambayes at python.org
Subject: RE: [Spambayes] Data file "out of balance"...?
> As a result, the ratio of spams to hams in my database
> quickly goes up over 2:1 which I understand from the FAQs is
> not the best way to have things set up.
I wouldn't worry about a 2:1 (or 1:2) imbalance. Anything up to around
5:1 is probably fine - and if you keep getting the results that you want,
then don't worry about the imbalance at all.
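As a rough illustration of that rule of thumb, here is a minimal sketch in
Python. It is not SpamBayes code: the function names are hypothetical, and
the default limit of 5.0 is only an assumption taken from the "around 5 to 1"
figure above, not a hard threshold.

```python
def imbalance_ratio(nspam: int, nham: int) -> float:
    """Return the larger training count divided by the smaller one."""
    if nspam <= 0 or nham <= 0:
        # A ratio is undefined until at least one message of each kind is trained.
        raise ValueError("need at least one trained spam and one trained ham")
    return max(nspam, nham) / min(nspam, nham)


def is_roughly_balanced(nspam: int, nham: int, limit: float = 5.0) -> bool:
    """True if the spam/ham imbalance is within the assumed rule-of-thumb limit."""
    return imbalance_ratio(nspam, nham) <= limit
```

With 200 spams and 100 hams trained, imbalance_ratio gives 2.0, which is well
inside the assumed limit.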
> When this happens, I have thought to train on more hams in
> the hope of getting the DB into better "balance" but I can't
> figure out how to train on hams only. I can't move hams from
> the spam folder because none are in there.
> What is the best way for me to handle the situation I
Put the messages to train in an otherwise empty folder, and use the
SpamBayes Manager dialog's Training tab to "Train Now".
Please always include the list (spambayes at python.org) in your replies
(reply-all), and please don't send me personal mail about SpamBayes. This
way, you get everyone's help, and avoid a lack of replies when I'm busy.
Spambayes at python.org
Check the FAQ before asking: http://spambayes.sf.net/faq.html