[Spambayes] training problem?

Ryan Malayter rmalayter at bai.org
Wed Dec 3 14:11:46 EST 2003

Seth Goodman wrote:
> 2) What training tactics would you suggest that might work better?

I've recently done a few things to balance my training ratio, and the
initial results are encouraging.

I have my "spam" folder, with 773 messages in it, all less than a month
old. I then use Outlook to do a search of all my mail folders *except*
my spam folder (this is easy in Outlook 2002 and up, because you can
exclude individual folders from search), for all mail messages newer
than a month old. I move the "cutoff date" on this search until the
number of messages returned by the search is very close to the number in
my spam folder.
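
For anyone who would rather script the balancing step than nudge the date by
hand, here is a minimal sketch in plain Python. It does not use Outlook or any
SpamBayes API; `balanced_cutoff` and its arguments are hypothetical names, and
it assumes you already have the received dates of your non-spam messages as
`datetime` objects.

```python
from datetime import datetime

def balanced_cutoff(ham_dates, spam_count):
    """Pick a cutoff date so that roughly spam_count ham messages are newer.

    ham_dates  -- received dates of all non-spam messages (hypothetical input)
    spam_count -- number of messages in the spam folder
    Returns (cutoff, selected): the cutoff date and how many ham messages
    are on or after it. Returns (None, 0) if there is nothing to select.
    """
    if spam_count <= 0 or not ham_dates:
        return None, 0
    newest_first = sorted(ham_dates, reverse=True)
    # If there is less ham than spam, take all of it.
    n = min(spam_count, len(newest_first))
    cutoff = newest_first[n - 1]
    # Messages sharing the cutoff timestamp are all kept, so the count
    # can come out slightly above n.
    selected = sum(1 for d in ham_dates if d >= cutoff)
    return cutoff, selected

# Example: 30 ham messages, one per day in November 2003, and 10 spam.
ham = [datetime(2003, 11, day) for day in range(1, 31)]
cutoff, selected = balanced_cutoff(ham, 10)
print(cutoff.date(), selected)
```

The sketch selects the newest ham first, which matches moving the cutoff date
back from today until the counts are close.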

I then *copy* all the messages from this search into a temporary Outlook
folder called "Ham for training". Then, I train on this folder and my
spam folder, rebuilding the database from scratch. I set my thresholds
to 20/80, and train appropriately on all spam or ham that falls in the
middle (unsure) range.
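
To make the 20/80 thresholds concrete, here is a rough sketch of how scores
fall into three buckets. It is not SpamBayes code: the names are hypothetical,
it assumes scores on a 0 to 1 scale (so 20/80 becomes 0.20 and 0.80), and the
treatment of scores exactly on a threshold is my assumption.

```python
HAM_CUTOFF = 0.20   # below this, treat the message as ham
SPAM_CUTOFF = 0.80  # at or above this, treat the message as spam

def bucket(score):
    """Map a spam probability in [0, 1] to 'ham', 'unsure', or 'spam'."""
    if score < HAM_CUTOFF:
        return "ham"
    if score >= SPAM_CUTOFF:
        return "spam"
    return "unsure"

def needs_training(scored):
    """Return the ids of messages whose score lands in the middle range.

    scored -- dict mapping a message id to its score (hypothetical input)
    """
    return [msg_id for msg_id, score in scored.items()
            if bucket(score) == "unsure"]

# Made-up scores, for illustration only.
scores = {"a": 0.03, "b": 0.55, "c": 0.97, "d": 0.20}
print(needs_training(scores))
```

The ids returned are the messages to look at by hand and train on as spam or
ham, whichever they really are.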

I'll add this to your Wiki...

