[Spambayes] Bayesian drawbacks?
Brendon
spambayes at whateley.com
Mon Sep 29 11:09:14 EDT 2003
Hi Gerrit,
I'm a new convert to Spambayes, and I have to tell you that the training is not
much of a drawback. I started getting useful reductions in spam after
training with just a handful each of good and bad email. I didn't do any
training BEFORE I started to use it; I just let it classify all the email as
unsure until I had a few of each. By the time I had trained on 10 each of spam
and ham it caught 23% of the spam. The next day I was up to 58% of the spam
caught, and by the 3rd day it was catching 98% with NO FALSE POSITIVES!!
Currently I've trained it on 119 each of spam and ham, all using the web
interface, and I get only a few "unsure" emails a day. My previous product used
fixed filters and got about 60% of the spam with a trickle of false positives.
Since spam is always evolving, I just check the email that is classified as
"unsure" every so often and train on those. To keep the counts balanced, I also
train on a matching number of messages of the other type (good or bad).
Unlike fixed filters, which slowly start to be fooled as
spammers start changing w0rds l1ke th1s, as soon as you train Spambayes on
even one message like this, it KNOWS that words like that never (except in
this email) appear in anything other than spam!
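
To make that concrete, here is a rough sketch of how a single training message
can push a mangled word's spam score up. This is a toy model, not the actual
Spambayes code: the class name TinyTokenStats, the whitespace tokenizer, and the
smoothing values s and x are all assumptions I picked for illustration.

```python
from collections import Counter


class TinyTokenStats:
    """Toy per-token counts (hypothetical; not the real Spambayes classifier)."""

    def __init__(self):
        self.spam_counts = Counter()  # token -> number of spam messages containing it
        self.ham_counts = Counter()   # token -> number of ham messages containing it
        self.nspam = 0
        self.nham = 0

    def train(self, text, is_spam):
        # Count each token at most once per message.
        tokens = set(text.lower().split())
        if is_spam:
            self.nspam += 1
            self.spam_counts.update(tokens)
        else:
            self.nham += 1
            self.ham_counts.update(tokens)

    def spam_prob(self, token, s=0.45, x=0.5):
        # s (weight of the prior) and x (score for unseen tokens) are assumed values.
        n = self.spam_counts[token] + self.ham_counts[token]
        if n == 0:
            return x  # never seen: stay neutral
        spam_ratio = self.spam_counts[token] / max(self.nspam, 1)
        ham_ratio = self.ham_counts[token] / max(self.nham, 1)
        p = spam_ratio / (spam_ratio + ham_ratio)
        # Blend the observed ratio with the neutral prior so that a single
        # sighting moves the score a lot, but not all the way to 1.0.
        return (s * x + n * p) / (s + n)


stats = TinyTokenStats()
stats.train("please review the words in this report", is_spam=False)
stats.train("buy now w0rds l1ke th1s", is_spam=True)

for token in ("w0rds", "words", "neverseen"):
    print(token, round(stats.spam_prob(token), 3))
```

After just one spam message, "w0rds" already scores well above neutral while
the normal spelling "words" scores well below it, and a token that has never
been seen stays at the neutral value.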
So, since training consists of selecting a radio button on a web page next to
each message I want to train, the training aspect takes a few seconds per
day.
Brendon.
On Monday 29 September 2003 03:37 am, Gerrit Holl wrote:
> Hi,
>
> I am considering installing a spam filter on my machine. However, I don't
> know which one to choose. I have read about the Bayesian approach Spambayes
> is using. As I understand it, Spambayes needs to be trained in order to be
> useful. Isn't this a major drawback? Is it possible that a non-Bayesian
> approach would much better suit my needs, or did I misunderstand the
> Bayesian technique then?
>
> yours,
> Gerrit Holl.