[Spambayes] solution for the "spam of the future"?

Tim Stone tim at fourstonesExpressions.com
Tue Dec 16 17:03:30 EST 2003


On Tue, 16 Dec 2003 15:49:45 -0600, Skip Montanaro <skip at pobox.com> wrote:

> Let's modify your proposal slightly.  Suppose we add a "missing: N" clue,
> where N is the number of tokens found in the message but not in the
> training database.

It would seem better to me to have a threshold, specified as an option: 
once more than that many words are missing, a single specific token is 
generated, like "more-than-n-words-missing".

Having separate "missing 1", "missing 2", "missing 3", ... tokens is 
probably not as good an indicator.

That said, at initial training time, and until a database grows to some 
reasonable size (see separate thread "how low can you go"), this token will 
always show up.  Therefore, it'd only be a good indicator after a certain 
point.  Does that limit its usefulness?
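
To make the threshold idea concrete, here is a rough sketch of what I have 
in mind. It is not Spambayes code: the function name, the "threshold" 
option and the use of a plain set for the training database are all 
assumptions for illustration, and I am counting distinct missing tokens, 
which is just one possible reading of "N".

```python
def missing_clue(tokens, known_tokens, threshold=5):
    """Return the extra synthetic tokens for one message.

    tokens       -- iterable of tokens extracted from the message
    known_tokens -- set standing in for the training database (assumption)
    threshold    -- hypothetical option: unknown tokens tolerated
    """
    # Collect the distinct message tokens the database has never seen.
    missing = {tok for tok in tokens if tok not in known_tokens}
    # Emit one fixed token only when the count exceeds the threshold.
    if len(missing) > threshold:
        return ["more-than-n-words-missing"]
    return []


known = {"python", "meeting", "agenda"}
message = ["python", "xqzt", "blorf", "meeting"]

print(missing_clue(message, known, threshold=1))  # two unknown tokens
print(missing_clue(message, known, threshold=2))  # not above the threshold
print(missing_clue(message, set(), threshold=3))  # empty database
```

The last call shows the problem above: with an empty or very small 
database every token counts as missing, so the clue fires on every 
message.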

-- 

Vous exprimer; Exprésese; Te stesso esprimere; Express yourself!
Tim Stone
See my photography at www.fourstonesExpressions.com
See my writing at www.xanga.com/obj3kshun


