[Spambayes] Data file "out of balance"...?

Kenneth Sole sole at soleassociates.com
Sun Jul 25 13:21:03 CEST 2004

Hi Tony,

When I tried to add to the DB by training only on hams, SB balked. It simply
did not occur to me that I could give it a bunch of hams and one spam, but
that worked just fine.

I sincerely appreciate your assistance,



   Sole & Associates, Inc.
   Box 292
   Durham, New Hampshire 03824

 Voice: 603-659-3169
   Fax: 603-659-2248
 Email: sole at soleAssociates.com
   URL: http://www.soleAssociates.com
   PGP: http://wwwkeys.ch.pgp.net:11371/pks/lookup?op=get&search=0xE17941C6

-----Original Message-----
From: spambayes-bounces at python.org [mailto:spambayes-bounces at python.org] On
Behalf Of Tony Meyer
Sent: Saturday, July 24, 2004 10:04 PM
To: 'Kenneth Sole'; spambayes at python.org
Subject: RE: [Spambayes] Data file "out of balance"...?

> As a result, the ratio of spams to hams in my database 
> quickly goes up over 2:1 which I understand from the FAQs is 
> not the best way to have things set up.

I wouldn't worry about a 2:1 (or 1:2) imbalance.  Anything up to around
5:1 is probably fine - and if you keep getting the results that you want,
then don't worry about the imbalance at all.
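
As a rough illustration of that rule of thumb, here is a minimal Python
sketch. The function names and the default threshold of 5 are hypothetical
and are not part of SpamBayes; the counts are simply numbers you supply
yourself for how many spams and hams you have trained on.

```python
def imbalance_ratio(nspam: int, nham: int) -> float:
    """Return the larger training count divided by the smaller one.

    The result is always >= 1.0, so 2:1 and 1:2 both give 2.0.
    """
    if nspam <= 0 or nham <= 0:
        raise ValueError("need at least one trained spam and one trained ham")
    return max(nspam, nham) / min(nspam, nham)


def is_worth_rebalancing(nspam: int, nham: int, threshold: float = 5.0) -> bool:
    """True only when the imbalance exceeds the (assumed) rough 5:1 guideline."""
    return imbalance_ratio(nspam, nham) > threshold


if __name__ == "__main__":
    # Example counts, made up for illustration: 200 spams and 100 hams.
    print(imbalance_ratio(200, 100))
    print(is_worth_rebalancing(200, 100))
```

Even when a check like this comes back true, the advice above still applies:
if the classification results are what you want, the ratio alone is no
reason to retrain.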

> When this happens, I have thought to train on more hams in 
> the hope of getting the DB into better "balance" but I can't 
> figure out how to train on hams only. I can't move hams from 
> the spam folder because none are in there.
> What is the best way for me to handle the situation I 
> describe?

Put the messages to train in an otherwise empty folder, and use the
SpamBayes Manager dialog's Training tab to "Train Now".

=Tony Meyer

Please always include the list (spambayes at python.org) in your replies
(reply-all), and please don't send me personal mail about SpamBayes. This
way, you get everyone's help, and avoid a lack of replies when I'm busy.

Spambayes at python.org
Check the FAQ before asking: http://spambayes.sf.net/faq.html