[Spambayes] What performance is good?

Erin Lazzaro hera at optonline.net
Tue Jun 1 22:04:46 EDT 2004


I adjusted my thresholds to 5% and 16%, which puts 5% of total incoming
mail in Suspects.  I've been running that way for a little over a week,
and I've had no ham in Junk and only 2 ham in Suspects.  There were 13
spam in Inbox, so I'm going to lower that threshold a little.

I think I'm ready to stop being so careful with my training and switch
to incremental.  I have one question:  If I move a message to Junk on my
PDA, how can I determine later on whether SB actually trained on that
message?

Thanks,
Erin

-----Original Message-----
From: Tony Meyer [mailto:tameyer at ihug.co.nz] 
Sent: Thursday, May 13, 2004 10:43 PM
To: 'Erin Lazzaro'; spambayes at python.org
Subject: RE: [Spambayes] What performance is good?

> How much do you expect to see in Junk Suspects?

Good results would be 2-5% of total incoming mail, IMO.

> I get about 15% ham.  All the ham goes in Inbox (I had one 
> ham in Junk Suspects last week), along with 1 or 2 spam, 
> which is beautiful.  25% to 30% of incoming mail goes in Junk 
> Suspects.  Is that reasonable?

Are you displaying the scores for these messages?  If so, do they all
tend to score over a certain value?  You might find, for example, that
you can simply reduce the spam threshold (say to 80%) and the problem
goes away.  What do you have the thresholds set for now?
(SpamBayes->SpamBayes Manager->Filtering)
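To make the threshold point concrete, here is a rough sketch of how two
cutoffs split scored mail into ham, unsure and spam, and how lowering the
spam cutoff shrinks the unsure share.  It does not call SpamBayes itself:
the function names, the handling of scores that fall exactly on a cutoff,
and the sample scores are all made up for illustration, and the 0.20/0.90
starting values are assumed rather than taken from anyone's actual setup.

```python
def classify(score, ham_cutoff, spam_cutoff):
    """Sort one score (0.0 to 1.0) into a bucket.

    Boundary handling here is an assumption for illustration only.
    """
    if score < ham_cutoff:
        return "ham"
    if score < spam_cutoff:
        return "unsure"
    return "spam"


def unsure_fraction(scores, ham_cutoff, spam_cutoff):
    """Fraction of messages that would land in the unsure folder."""
    if not scores:
        return 0.0
    unsure = sum(
        1 for s in scores
        if classify(s, ham_cutoff, spam_cutoff) == "unsure"
    )
    return unsure / len(scores)


# Hypothetical scores: most of the "unsure" mail sits just under 0.90.
scores = [0.01, 0.03, 0.45, 0.82, 0.85, 0.88, 0.97, 0.99, 1.00, 1.00]

before = unsure_fraction(scores, 0.20, 0.90)  # 4 of 10 are unsure
after = unsure_fraction(scores, 0.20, 0.80)   # 1 of 10 is unsure
print(before, after)
```

If the messages in Junk Suspects cluster just below the spam cutoff like
this, lowering that cutoff moves them to Junk; if they are spread evenly
between the two cutoffs, it will help much less.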

> My database is unbalanced (36 ham/84 spam), so I'm only 
> training on ham in Junk Suspects (i.e., hardly ever).  Should 
> I be seeking ham to train on?

IMO, that's not imbalanced enough to worry about.

=Tony Meyer

---
Please always include the list (spambayes at python.org) in your replies
(reply-all), and please don't send me personal mail about SpamBayes.
This way, you get everyone's help, and avoid a lack of replies when I'm
busy.




