[Spambayes] Spam probability test

Kenny Pitt kennypitt at hotmail.com
Fri Oct 31 12:49:09 EST 2003


Jens Rantil wrote:
> Can someone explain the attached screenshot? How come that the
> probability for spam isn't 100%?

[from screenshot: spamcount=6, hamcount=0, spamprob=0.965116]

There is a little more to the probability calculation than a straight
ham/spam ratio.  An adjustment is applied to compensate for words that
have been seen only a few times, or not at all, in the ham or the spam.
The theory here is that just because we haven't seen a word in a ham
message so far doesn't guarantee that we never will.  The calculation
therefore takes into account the total number of times the word has
been seen: the fewer the sightings, the more the estimate is pulled
back from the extremes.  If your spamcount had been much higher, the
probability would have been much closer to 100%.  For example, with
spamcount=100 (and hamcount still 0) the probability of spam goes to
0.997760.
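
To make the arithmetic concrete, here is a minimal sketch of that kind
of adjustment.  It is not copied from the classifier source.  It assumes
the adjustment has the form (s*x + n*p) / (s + n), with a strength
s=0.45 and a prior x=0.5 for unknown words; those two values are my
assumption, chosen because they reproduce both figures quoted above.
The function name and the nspam/nham message totals are hypothetical
(the screenshot doesn't show the totals, and they cancel out when
hamcount is 0).

```python
def adjusted_spamprob(spamcount, hamcount, nspam, nham, s=0.45, x=0.5):
    """Sketch of a count-aware spam probability for a single word.

    spamcount, hamcount: number of spam/ham messages containing the word.
    nspam, nham: total spam/ham messages trained on (hypothetical here).
    s, x: assumed strength and prior probability for unseen words.
    """
    # Raw estimate: the straight ratio, normalised by corpus sizes.
    spamratio = spamcount / nspam
    hamratio = hamcount / nham
    raw = spamratio / (spamratio + hamratio)

    # Blend the raw estimate with the prior x.  The more often the word
    # has been seen (n), the less weight the prior carries.
    n = spamcount + hamcount
    return (s * x + n * raw) / (s + n)


if __name__ == "__main__":
    # Totals of 200 spam and 200 ham are made up for illustration.
    print(f"{adjusted_spamprob(6, 0, nspam=200, nham=200):.6f}")    # 0.965116
    print(f"{adjusted_spamprob(100, 0, nspam=200, nham=200):.6f}")  # 0.997760
```

With hamcount=0 the raw ratio is exactly 1.0, so the only thing keeping
the result below 100% is the s*x term, and its influence shrinks as the
count grows.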

-- 
Kenny Pitt



