[Spambayes] lots of unsures, heavily biased towards spam
David Abrahams
dave at boost-consulting.com
Mon Feb 5 16:27:25 CET 2007
skip at pobox.com writes:
> >> If the interface you're using allows you to delete trained mails you
> >> could also try deleting a bunch of old mails you classified as spam.
>
> Dave> It does, but I have to confess I don't really understand the
> Dave> implications of doing so.
>
> I think most people agree that the nature of spam changes over time.
I know that; I meant the technical implications. In particular, I
asked:
> > I know spambayes keeps a database; when I delete already-trained
> > emails from my xxx-training folders does it forget everything
> > about those messages and rebuild the database using the other
> > messages as though from scratch, or is some of the information
> > about those deleted messages retained?
> When I find my ham:spam ratio getting a bit out-of-whack, I
> generally throw out a few old spams.
>
> I know this won't help you with the imap filter, however...
Why not?
> I use the train-to-exhaustion script in the contrib directory which
> helps keep my ham:spam ratio tractable. I have it train with a
> fixed ratio (right now, 2 spams to 1 ham) and have it train from
> newest to oldest messages. Given a pair of spam and ham mailboxes
> it thus reverses them then trains using 2 spam, 1 ham, 2 spam, 1
> ham, ... until one mailbox is exhausted. It ignores any remaining
> messages in the other mailbox. The cycle repeats for any messages
> which weren't correctly scored on the last pass. Once a message
> scores correctly, it isn't considered again. If a message scores
> correctly the first time it's tossed out altogether.
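
The procedure quoted above can be sketched roughly as follows. This is
only a reading of the description, not the contrib script itself:
ToyClassifier, interleave, and train_to_exhaustion are hypothetical
names, the word-count scoring stands in for the real classifier, and the
0.5 cutoff, the max_rounds limit, and retraining a message each time it
is missed are all assumptions.

```python
from collections import Counter


class ToyClassifier:
    """Hypothetical stand-in for a real classifier (not the SpamBayes API)."""

    def __init__(self):
        self.spam_counts = Counter()
        self.ham_counts = Counter()

    def learn(self, text, is_spam):
        counts = self.spam_counts if is_spam else self.ham_counts
        counts.update(text.lower().split())

    def score(self, text):
        # Fraction of known word occurrences that came from spam;
        # 0.5 when none of the words have been seen yet.
        words = text.lower().split()
        spam = sum(self.spam_counts[w] for w in words)
        ham = sum(self.ham_counts[w] for w in words)
        total = spam + ham
        return 0.5 if total == 0 else spam / total


def interleave(spams, hams, spams_per_ham=2):
    """Yield (message, is_spam) newest first in a fixed spam:ham ratio.

    Both lists are assumed to be ordered oldest to newest. Iteration
    stops as soon as either mailbox runs out; leftovers are ignored.
    """
    spam_iter = iter(reversed(spams))
    ham_iter = iter(reversed(hams))
    done = object()
    while True:
        for _ in range(spams_per_ham):
            msg = next(spam_iter, done)
            if msg is done:
                return
            yield msg, True
        msg = next(ham_iter, done)
        if msg is done:
            return
        yield msg, False


def train_to_exhaustion(classifier, spams, hams, spams_per_ham=2,
                        cutoff=0.5, max_rounds=10):
    """Train only on misclassified messages, repeating on the misses.

    Returns the (message, is_spam) pairs trained on, in training order.
    """
    pending = list(interleave(spams, hams, spams_per_ham))
    trained = []
    for _ in range(max_rounds):
        missed = []
        for msg, is_spam in pending:
            scored_as_spam = classifier.score(msg) >= cutoff
            if scored_as_spam != is_spam:
                # Wrong score: train on it and revisit it next round.
                classifier.learn(msg, is_spam)
                trained.append((msg, is_spam))
                missed.append((msg, is_spam))
            # A correctly scored message is dropped for good.
        if not missed:
            break
        pending = missed
    return trained
```

Nothing in the sketch depends on where the messages are stored; it only
needs two lists of message texts ordered oldest to newest.
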
Can that procedure be applied to my IMAP folders?
--
Dave Abrahams
Boost Consulting
www.boost-consulting.com