[Spambayes] lots of unsures, heavily biased towards spam

David Abrahams dave at boost-consulting.com
Mon Feb 5 16:27:25 CET 2007


skip at pobox.com writes:

>     >> If the interface you're using allows you to delete trained mails you
>     >> could also try deleting a bunch of old mails you classified as spam.
>
>     Dave> It does, but I have to confess I don't really understand the
>     Dave> implications of doing so.
>
> I think most people agree that the nature of spam changes over time.

I know that; I meant the technical implications.  In particular, I
asked:

>  > I know spambayes keeps a database; when I delete already-trained
>  > emails from my xxx-training folders does it forget everything
>  > about those messages and rebuild the database using the other
>  > messages as though from scratch, or is some of the information
>  > about those deleted messages retained?

> when I find my ham:spam ratio getting a bit out-of-whack, I
> generally throw out a few old spams.
>
> I know this won't help you with the imap filter, however...  

Why not?

> I use the train-to-exhaustion script in the contrib directory which
> helps keep my ham:spam ratio tractable.  I have it train with a
> fixed ratio (right now, 2 spams to 1 ham) and have it train from
> newest to oldest messages.  Given a pair of spam and ham mailboxes
> it thus reverses them then trains using 2 spam, 1 ham, 2 spam, 1
> ham, ... until one mailbox is exhausted.  It ignores any remaining
> messages in the other mailbox.  The cycle repeats for any messages
> which weren't correctly scored on the last pass.  Once a message
> scores correctly, it isn't considered again.  If a message scores
> correctly the first time it's tossed out altogether.

Can that procedure be applied to my IMAP folders?
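
To check that I'm reading the description correctly, here is a rough
Python sketch of the procedure as I understand it from the paragraph
quoted above.  It is not the contrib script itself and does not use any
SpamBayes API: ToyClassifier, interleave and train_to_exhaustion are
hypothetical names made up for illustration, the word-count scoring is
only a stand-in for the real classifier, and max_rounds is an assumed
safety limit.  Stopping as soon as fewer than two spams remain is also
my assumption about what "exhausted" means.

```python
from collections import Counter


class ToyClassifier:
    """Hypothetical stand-in classifier (word counts); NOT SpamBayes."""

    def __init__(self):
        self.spam_words = Counter()
        self.ham_words = Counter()

    def train(self, text, is_spam):
        target = self.spam_words if is_spam else self.ham_words
        target.update(text.split())

    def is_spam(self, text):
        """Return True (spam), False (ham) or None (unsure)."""
        words = text.split()
        spam_hits = sum(self.spam_words[w] for w in words)
        ham_hits = sum(self.ham_words[w] for w in words)
        if spam_hits == ham_hits:
            return None
        return spam_hits > ham_hits


def interleave(spams, hams, spam_per_ham=2):
    """Order messages newest first as N spam, 1 ham, N spam, 1 ham, ...

    Both inputs are assumed to be ordered oldest to newest.  Stops when
    either mailbox cannot supply its share; leftovers are ignored.
    """
    spams = list(reversed(spams))
    hams = list(reversed(hams))
    ordered = []
    si = hi = 0
    while si + spam_per_ham <= len(spams) and hi < len(hams):
        ordered.extend((m, True) for m in spams[si:si + spam_per_ham])
        ordered.append((hams[hi], False))
        si += spam_per_ham
        hi += 1
    return ordered


def train_to_exhaustion(classifier, spams, hams, spam_per_ham=2,
                        max_rounds=10):
    """Train only on messages that score wrongly; repeat on the misses.

    Returns the number of training operations performed.
    """
    pending = interleave(spams, hams, spam_per_ham)
    trained = 0
    for _ in range(max_rounds):
        missed = []
        for text, is_spam in pending:
            if classifier.is_spam(text) == is_spam:
                continue  # scored correctly: never considered again
            classifier.train(text, is_spam)
            trained += 1
            missed.append((text, is_spam))
        if not missed:
            break
        pending = missed  # next pass looks only at this pass's misses
    return trained


spams = ["buy pills", "cheap pills now", "win money", "free money now"]
hams = ["meeting at noon", "lunch at noon"]
classifier = ToyClassifier()
trained = train_to_exhaustion(classifier, spams, hams)
```

If that is the right shape, then nothing in it seems to depend on where
the messages are stored, only on having the two collections in date
order, which is why I'm asking about IMAP.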

-- 
Dave Abrahams
Boost Consulting
www.boost-consulting.com


