[Spambayes] Contacts whitelist

Tim Peters tim.one at comcast.net
Fri May 28 15:29:40 EDT 2004


[Walter Görlitz]
> well a worm is not technically a SPAM message and that's the problem.
> People classify all unwanted mail as SPAM when the original connotation
> was unsolicited commercial mail.

SpamBayes has no inherent definition of spam:  all tokens start out at a
neutral 0.5 in this code, and all ideas about what "spam" and "ham" mean
come from the training you give it.  If news about Java is spam to you, or
viruses are ham to you, fine -- the classification engine has no beliefs of
its own.
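
A rough illustration of that point, not the actual SpamBayes source: the sketch below assumes a Robinson-style token probability in which an untrained token sits at a neutral 0.5 and only your training counts move it. The function name, the parameter names, and the 0.45 "strength" value are assumptions made for this sketch.

```python
def token_spam_prob(spam_count, ham_count, nspam, nham,
                    strength=0.45, neutral=0.5):
    """Estimate how spammy one token is, from training counts alone.

    spam_count / ham_count: trained spam / ham messages containing the token.
    nspam / nham: total trained spam / ham messages.
    strength, neutral: assumed prior weight and prior value for this sketch.
    """
    n = spam_count + ham_count
    if n == 0:
        # Never seen in training: no opinion either way.
        return neutral
    spam_ratio = spam_count / nspam if nspam else 0.0
    ham_ratio = ham_count / nham if nham else 0.0
    if spam_ratio + ham_ratio == 0.0:
        return neutral
    raw = spam_ratio / (spam_ratio + ham_ratio)
    # Blend the raw estimate with the neutral prior; more sightings of the
    # token mean the prior matters less.
    return (strength * neutral + n * raw) / (strength + n)
```

Calling it with the same token but the spam and ham counts swapped gives mirror-image results, which is the sense in which "Java news" can be spam in one person's training data and ham in another's.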

That said, I believe there are more *effective* ways to catch virus email
than with this system.  SpamBayes looks at email content, with all features
treated the same (equal weight), and typical virus email has fewer features
*to* look at than typical UCE (unsolicited commercial email).  Fewer
features generally lead to more moderate scores (less evidence -> less
confidence), so even blatant virus email can easily end up rated Unsure in
this system.  I'm surprised that it catches as much virus email as it does
for me.
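
To make "less evidence -> less confidence" concrete, here is a minimal sketch of chi-squared combining of per-token probabilities. It is an illustration under assumptions, not the SpamBayes implementation: the function names are hypothetical, and the 0.9 spam cutoff mentioned afterwards is assumed rather than taken from this message.

```python
import math

def chi2q(x2, v):
    """Probability that a chi-squared variable with v degrees of freedom
    (v must be even) is at least x2."""
    m = x2 / 2.0
    term = math.exp(-m)
    total = term
    for i in range(1, v // 2):
        term *= m / i
        total += term
    return min(total, 1.0)

def combined_score(probs):
    """Combine per-token spam probabilities (each strictly between 0 and 1)
    into one score in [0, 1]; 0.5 means no opinion."""
    n = len(probs)
    if n == 0:
        return 0.5
    # Evidence for spam and evidence for ham are measured separately.
    spam_evidence = 1.0 - chi2q(-2.0 * sum(math.log(1.0 - p) for p in probs), 2 * n)
    ham_evidence = 1.0 - chi2q(-2.0 * sum(math.log(p) for p in probs), 2 * n)
    return (spam_evidence - ham_evidence + 1.0) / 2.0

# Same per-token strength, different amounts of evidence.
few_clues = combined_score([0.8] * 2)    # e.g. a terse virus message
many_clues = combined_score([0.8] * 20)  # e.g. a wordy commercial pitch
```

With an assumed spam cutoff of 0.9, few_clues lands below the cutoff (Unsure) while many_clues lands well above it, even though every individual token is equally spammy in both cases.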

...

>> So far, I have not received any false positives for emails from friends.

> If you're using SpamBayes to filter out all unwanted mail, not just
> SPAM, you may get too many false negatives, but that's up to you.

The definition of spam in my personal classifier is "email Tim doesn't want
to see", which includes spam, viruses, bounces from viruses forged to appear
as if they came from me, and so on.  I haven't noticed any problem with the
false negative (FN) rate as a result, although msgs *appearing* to come from
people I know rate Unsure more often than I'd like.  That appears to be
because there's been a huge increase over the last month in the rate of
viruses appearing to come from people I know.  One of those was a false
negative near the end of last month, but that's it.




