[Spambayes] RE: solution for the "spam of the future"?

Kenny Pitt kennypitt at hotmail.com
Tue Dec 16 17:27:05 EST 2003


Coe, Bob wrote:
> Don't start generating the "Missing: N" token until the database is
> large enough for it to make sense. 

If this works at all, it seems like the *percentage* of unknown word
tokens in the message would work better than a log()'d count.  A
very large newsletter is pretty much guaranteed to have a higher *count*
of unknown tokens than a short mailing list message, but that's because
it has more total tokens and not because it's any spammier.
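
A rough sketch of the difference, in Python.  The function names, the
log2 bucketing, the 10-point percentage buckets, and the "MissingPct"
token name are all assumptions made up for illustration; none of this
is taken from the Spambayes code:

```python
import math

def missing_count_token(tokens, known):
    """Synthetic token from a log-scaled *count* of unknown tokens.

    The log2 bucketing is an assumed stand-in for the "log()'d count".
    """
    unknown = sum(1 for t in tokens if t not in known)
    if unknown == 0:
        return "Missing: 0"
    return "Missing: %d" % (int(math.log2(unknown)) + 1)

def missing_percent_token(tokens, known, step=10):
    """Synthetic token from the *percentage* of unknown tokens.

    The percentage is rounded down to a multiple of `step` (assumed)
    so that similar messages share a token.
    """
    if not tokens:
        return None
    unknown = sum(1 for t in tokens if t not in known)
    pct = 100 * unknown // len(tokens)
    return "MissingPct: %d" % (pct // step * step)

# Hypothetical data: a short message and a long one with the same mix.
known = {"a", "b"}
short_msg = ["a", "x", "b", "y"]
long_msg = short_msg * 50
```

With this made-up data the count-based token differs between the two
messages even though half the tokens are unknown in both, while the
percentage-based token comes out the same.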

-- 
Kenny Pitt



