[Spambayes] Marking message flagged as spam as non spam
Amedee Van Gasse (amedee.be)
amedee at amedee.be
Tue Apr 7 09:15:17 CEST 2009
On Tue, April 7, 2009 07:10, Thomas Hruska wrote:
> skip at pobox.com wrote:
>> Keith> How many hams and spams have you trained on?
>> Keith> -Quite a few, around 350 spam mails, hams around 4500.
>> This is way out-of-balance. Typically SpamBayes works best with
>> roughly equal numbers of ham and spam.
I have 14005 spam and 2679 ham. That's way out of balance too, but I can't
say that Spambayes isn't working well enough for me. No complaints here.
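
To put a number on "out of balance", here is a minimal sketch that computes the ratio of the larger training count to the smaller one for the two sets of counts quoted in this thread. The function name `imbalance_ratio` is made up for illustration and is not part of SpamBayes.

```python
def imbalance_ratio(n_spam: int, n_ham: int) -> float:
    """Return larger count / smaller count; 1.0 means perfectly balanced.

    Hypothetical helper for illustration only, not a SpamBayes function.
    """
    if n_spam <= 0 or n_ham <= 0:
        raise ValueError("need at least one trained message of each kind")
    return max(n_spam, n_ham) / min(n_spam, n_ham)


# Counts quoted in this thread.
print(round(imbalance_ratio(350, 4500), 1))    # 12.9 (ham-heavy)
print(round(imbalance_ratio(14005, 2679), 1))  # 5.2  (spam-heavy)
```

Both databases are lopsided, just in opposite directions.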
> While I agree that this is out of balance, Spambayes seriously needs to
> get its act together: stop allowing users to train on imbalances or on
> messages classified correctly, and allow users to reset the database
> periodically (the POP3 proxy server seriously needs a feature that allows
> you to do a complete reset of the database within the UI itself).
I use the procmail filter on Linux. No fancy GUI for me. I have never
reset my db, but I think it's just a simple matter of rm'ing the db file.
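
A rough sketch of that kind of reset, assuming the training database is a single file. The path is a placeholder that you would have to replace with wherever your own setup keeps it, and `reset_database` is a made-up helper, not something SpamBayes ships. It keeps a backup copy before deleting, which a plain rm would not do.

```python
import shutil
from pathlib import Path


def reset_database(db_path: Path, keep_backup: bool = True) -> bool:
    """Remove the database file so training starts from scratch.

    Hypothetical helper; assumes the database is one regular file.
    Returns False if there was nothing to remove.
    """
    if not db_path.exists():
        return False
    if keep_backup:
        # Keep the old data next to the original, e.g. "name.db.bak".
        shutil.copy2(db_path, db_path.with_name(db_path.name + ".bak"))
    db_path.unlink()
    return True


# Example call (placeholder path, adjust to your own setup):
# reset_database(Path.home() / "path" / "to" / "your-spambayes.db")
```

Stop any running filter or proxy before doing this, so that nothing writes to the file while it is being copied and removed.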
> The rule of thumb I follow is: Train on only one spam in ham and one
> ham in unsure. Skip training on messages I plan on filtering using my
> e-mail client (i.e. no point in training on messages I'm going to
> whitelist in the first place).
That's what I do too. I have some procmail rules that come before the
> Once I reach about 300 of each type, reset
> the database and start over.
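
The rule of thumb quoted above could be sketched roughly as follows. This is one reading of it: train only on spam that was classified as ham and on ham that was classified as unsure, skip anything that will be whitelisted anyway, and reset once about 300 messages of each type have been trained. All names here are invented for illustration and none of this is SpamBayes code.

```python
RESET_THRESHOLD = 300  # "about 300 of each type", taken from the quoted rule


def should_train(true_label: str, classified_as: str,
                 whitelisted: bool = False) -> bool:
    """Decide whether a message is worth training on under the quoted rule.

    true_label is "ham" or "spam"; classified_as is "ham", "spam" or "unsure".
    """
    if whitelisted:
        # No point training on mail the e-mail client filters anyway.
        return False
    spam_in_ham = true_label == "spam" and classified_as == "ham"
    ham_in_unsure = true_label == "ham" and classified_as == "unsure"
    return spam_in_ham or ham_in_unsure


def should_reset(n_spam_trained: int, n_ham_trained: int,
                 threshold: int = RESET_THRESHOLD) -> bool:
    """True once both training counts have reached the threshold."""
    return n_spam_trained >= threshold and n_ham_trained >= threshold


print(should_train("spam", "ham"))     # True
print(should_train("spam", "unsure"))  # False
print(should_reset(300, 120))          # False
```

Correctly classified messages are never trained on here, which is what keeps the two counts from drifting far apart.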
> My problem is that 99.9% of my incoming mail is spam, so there is an
> imbalance by default. I am forced to delete unsures because the imbalance
> is so great. IMO, 'unsure' is an inappropriate word choice for the
> category. It causes many users to feel they need to tell Spambayes what
> is ham and spam. This, in turn, creates the imbalances they then
> When was the last update to Spambayes? Time for a new version!
Are you using the stable version or the beta version?