[Spambayes] More "spam of the future" lately?
Michael N. Nitabach
mnitabach at acedsl.com
Wed Dec 17 16:00:39 EST 2003
> -----Original Message-----
> From: Tim Peters [mailto:tim.one at comcast.net]
> Sent: Wednesday, December 17, 2003 3:42 PM
> To: Michael N. Nitabach; spambayes at python.org
> Subject: RE: [Spambayes] More "spam of the future" lately?
> >> 0.7 maybe, but you'd eventually regret dropping [spam_cutoff] to 0.5.
> [Michael N. Nitabach]
> > What makes you say that? I have my certain-spam cutoff at .30, and
> > my uncertain at .01. My training database has about 8000 hams and
> > 3000 spams. I have only ever received ten hams that scored over
> > .01, and only one over .20.
> Unless you've eyeballed every message scored as spam, then it's almost
> certain you've suffered false positives due to those
I just looked in my certain-spam folder at all e-mails that scored below 0.70. Only a single one was a false positive: a SpamBayes mailing list digest that contained a complete actual spam e-mail that someone had posted, which scored 0.49.
> There's more info on the project's background page:
> Note especially the third graph. The way spamprobs are combined in
> SpamBayes guarantees that a highly ambiguous message will score very near
> 0.5 (explained in more detail before the third graph, and much more at
I receive a substantial amount of e-mail that scores between 0.30 and 0.70, but so far it has *all* been spam.
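
The point quoted above, that an ambiguous message lands near 0.5, can be illustrated with a minimal sketch of chi-squared combining. This is an illustration under stated assumptions, not the project's actual source: the function names chi2q and combine are made up for this example, it sums logarithms directly rather than guarding against underflow, and it assumes every per-word spamprob lies strictly between 0 and 1.

```python
import math

def chi2q(x2, dof):
    """Probability that a chi-squared variable with `dof` degrees of
    freedom is >= x2. Only valid for even `dof` (hypothetical helper)."""
    assert dof % 2 == 0
    m = x2 / 2.0
    total = term = math.exp(-m)
    for i in range(1, dof // 2):
        term *= m / i
        total += term
    # Rounding can push the sum slightly above 1.0.
    return min(total, 1.0)

def combine(probs):
    """Combine per-word spam probabilities into one score in [0, 1].
    Assumes each value in `probs` is strictly between 0 and 1."""
    n = len(probs)
    if n == 0:
        return 0.5
    # S is near 1 when the words jointly look spammy,
    # H is near 1 when they jointly look hammy.
    s = 1.0 - chi2q(-2.0 * sum(math.log(1.0 - p) for p in probs), 2 * n)
    h = 1.0 - chi2q(-2.0 * sum(math.log(p) for p in probs), 2 * n)
    # Strong evidence in both directions cancels, giving a score near 0.5.
    return (s - h + 1.0) / 2.0

mostly_spam = combine([0.99] * 10)
mostly_ham = combine([0.01] * 10)
ambiguous = combine([0.99] * 10 + [0.01] * 10)
```

With these made-up inputs, ten strongly spammy clues alone score close to 1, ten strongly hammy clues alone score close to 0, and the mix of both scores very close to 0.5, which is why a cutoff near 0.5 leaves little margin for messages that carry strong clues of both kinds.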
> The kinds of email people get vary widely, though, and it's possible your
> mix is extremely well-suited to this classifier, devoid of any significant
Well, the interesting thing is that a lot of my spam is relatively technical sales-pitch e-mail that talks about the same sorts of things I discuss in my professional ham e-mails.
> (I'll note that if you use your SpamBayes'd email only for
> professional purposes, and no personal ones (like chatting with friends and
> relatives), it doesn't strain my imagination that your ham could be *so*
> uniform that ambiguity doesn't arise -- but then your email mix would be
> atypical too.)
No, I use it for equal parts professional and personal correspondence.
Michael N. Nitabach, Ph.D., J.D.
Department of Cellular and Molecular Physiology
Yale University School of Medicine
mnitabach at acedsl.com