[Spambayes] Marking message flagged as spam as non spam

Amedee Van Gasse (amedee.be) amedee at amedee.be
Tue Apr 7 09:15:17 CEST 2009

On Tue, April 7, 2009 07:10, Thomas Hruska wrote:
> skip at pobox.com wrote:
>> Keith> How many hams and spams have you trained on?
>> Keith> -Quite a few, around 350 spam mails, hams around 4500.
>> This is way out-of-balance.  Typically SpamBayes works best with
>> roughly equal numbers of ham and spam.

I have 14005 spam and 2679 ham. That's way out-of-balance too, but I can't
say that Spambayes isn't working well enough for me. No complaints here.
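
To put numbers on "out-of-balance", here is a small Python sketch that
compares the two sets of counts mentioned in this thread. The function name
imbalance_ratio is made up for illustration and is not part of Spambayes.

```python
def imbalance_ratio(nspam: int, nham: int) -> float:
    """Return how many times larger the bigger class is than the smaller one."""
    if nspam <= 0 or nham <= 0:
        raise ValueError("need at least one trained message of each kind")
    return max(nspam, nham) / min(nspam, nham)


# My counts: 14005 spam, 2679 ham.
print(round(imbalance_ratio(14005, 2679), 1))  # 5.2
# Keith's counts: about 350 spam, about 4500 ham.
print(round(imbalance_ratio(350, 4500), 1))    # 12.9
```

So my database is skewed roughly 5 to 1 towards spam, while Keith's is
skewed roughly 13 to 1 towards ham.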

> While I agree that this is out of balance, Spambayes seriously needs to
> get its act together and stop allowing users to train on imbalances or
> messages classified correctly, and allow users to reset the database
> periodically (the POP3 proxy server seriously needs a feature that allows
> you to do a complete reset of the database within the UI itself).

I use the procmail filter on Linux. No fancy GUI for me. I have never
reset my db, but I think it's just a simple matter of rm'ing the db file.
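
For anyone who would rather script it than type rm, a minimal Python sketch
of the same idea follows. I have not needed this myself, so treat it as an
assumption that removing the file is all a reset takes. The function name
reset_db is hypothetical, and the path is whatever your own configuration
points at.

```python
import os


def reset_db(db_path: str) -> bool:
    """Delete the training database file.

    Returns True if a file was removed, False if there was nothing to remove.
    """
    try:
        os.remove(db_path)
        return True
    except FileNotFoundError:
        return False
```

All training is gone after this, so the filter has to be trained again
from scratch.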

> The rule of thumb I follow is:  Train on only one spam in ham and one
> ham in unsure.  Skip training on messages I plan on filtering using my
> e-mail client (i.e. no point in training on messages I'm going to
> whitelist in the first place).

That's what I do too. I have some procmail rules that come before the
Spambayes incantation.
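
As I read the quoted rule of thumb, it boils down to a decision like the
Python sketch below. This is only my literal interpretation, not anything
Spambayes does by itself, and should_train is a made-up name.

```python
def should_train(actual: str, classified: str) -> bool:
    """Decide whether to train on a message under the quoted rule of thumb.

    actual:     what the message really is, "ham" or "spam".
    classified: where the filter put it, "ham", "spam" or "unsure".
    """
    # Spam that slipped through into the ham folder.
    if actual == "spam" and classified == "ham":
        return True
    # Ham that the filter could not decide on.
    if actual == "ham" and classified == "unsure":
        return True
    # Everything else, including correctly classified mail, is skipped.
    return False
```

Mail that an earlier procmail rule has already filed never reaches this
decision at all.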

>  Once I reach about 300 of each type, reset
> the database and start over.
> My problem is that 99.9% of my incoming mail is spam, so there is an
> imbalance by default.  I am forced to delete unsures because the imbalance
> is so great.  IMO, 'unsure' is an inappropriate word choice for the
> category.  It causes many users to feel they need to tell Spambayes what
> is ham and spam.  This, in turn, creates the imbalances they then
> experience.
> When was the last update to Spambayes?  Time for a new version!

Are you using the stable version or the beta version?


