[Spambayes] General training questions

Hans Henderson hans at pobox.com
Tue Oct 12 02:48:34 CEST 2004


Sorry if this has been answered elsewhere, but I believe others have asked
similar questions that I didn't see answered.

My plan so far is to put all "unsure" and all wrongly classified mail
into the training folders.

The docs say to try to keep the amounts of spam and ham trained
balanced, but I get a LOT more spam than ham (like 2000:1).

I kept the initial training set balanced, but unfortunately my ham
sample was much older than the spam, as I don't keep old spam <g>

But going forward, is there any harm in the ratio in the database becoming
unbalanced because I move far more spam than ham from the unsure
folder into the training folders?

Or is it better to deny SB the training fodder?




