[Spambayes] What performance is good?
Erin Lazzaro
hera at optonline.net
Tue Jun 1 22:04:46 EDT 2004
I adjusted my thresholds to 5% and 16% to get 5% of total incoming mail in
Suspects. I've been running that way for a little over a week, and I've
had no ham in Junk and only 2 ham in Suspects. There were 13 spam in
Inbox, so I'm going to lower that threshold a little.
I think I'm ready to stop being so careful with my training and switch
to incremental. I have one question: If I move a message to Junk on my
PDA, how can I determine later on whether SB actually trained on that
message?
Thanks,
Erin
-----Original Message-----
From: Tony Meyer [mailto:tameyer at ihug.co.nz]
Sent: Thursday, May 13, 2004 10:43 PM
To: 'Erin Lazzaro'; spambayes at python.org
Subject: RE: [Spambayes] What performance is good?
> How much do you expect to see in Junk Suspects?
Good results would be 2-5% of total incoming mail, IMO.
> I get about 15% ham. All the ham goes in Inbox (I had one
> ham in Junk Suspects last week), along with 1 or 2 spam,
> which is beautiful. 25% to 30% of incoming mail goes in Junk
> Suspects. Is that reasonable?
Are you displaying the scores for these messages? If so, do they all tend
to score over a certain value? You might find, for example, that you can
simply reduce the spam threshold (say to 80%) and the problem goes away.
What do you have the thresholds set to now? (SpamBayes->SpamBayes
Manager->Filtering)
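
To make the threshold idea concrete, here is a rough sketch of how two
cutoffs split scored mail into ham, unsure (Junk Suspects) and spam, and
how lowering the spam cutoff shrinks the unsure share. The function names,
the exact boundary handling, and the scores are all made up for
illustration; this is not SpamBayes' own code, and your real scores will
differ.

```python
def classify(score, ham_cutoff, spam_cutoff):
    """Bucket a score in [0, 1] using two cutoffs (boundary rule is an assumption)."""
    if score < ham_cutoff:
        return "ham"
    if score < spam_cutoff:
        return "unsure"
    return "spam"


def unsure_fraction(scores, ham_cutoff, spam_cutoff):
    """Fraction of messages that would land in the unsure folder."""
    labels = [classify(s, ham_cutoff, spam_cutoff) for s in scores]
    return labels.count("unsure") / len(labels)


# Hypothetical scores, for illustration only.
scores = [0.01, 0.02, 0.85, 0.88, 0.95, 0.99, 0.83, 0.50, 1.00, 0.97]

# Compare the unsure share with the spam cutoff at 90% and at 80%.
for spam_cutoff in (0.90, 0.80):
    share = unsure_fraction(scores, ham_cutoff=0.20, spam_cutoff=spam_cutoff)
    print(f"spam cutoff {spam_cutoff:.0%}: {share:.0%} unsure")
```

If most of the unsure messages score just under the spam cutoff, as in
this invented sample, dropping the cutoff moves them straight to Junk.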
> My database is unbalanced (36 ham/84 spam), so I'm only
> training on ham in Junk Suspects (i.e., hardly ever). Should
> I be seeking ham to train on?
IMO, that's not imbalanced enough to worry about.
=Tony Meyer
---
Please always include the list (spambayes at python.org) in your replies
(reply-all), and please don't send me personal mail about SpamBayes. This
way, you get everyone's help, and avoid a lack of replies when I'm busy.