[Spambayes] solution for the "spam of the future"?
Tim Stone
tim at fourstonesExpressions.com
Tue Dec 16 17:03:30 EST 2003
On Tue, 16 Dec 2003 15:49:45 -0600, Skip Montanaro <skip at pobox.com> wrote:
> Let's modify your proposal slightly. Suppose we add a "missing: N" clue,
> where N is the number of tokens found in the message but not in the
> training database.
It would seem better to me to have a threshold, specified as an option, for
how many words must be missing before a single specific token is generated,
like "more-than-n-words-missing".
Having separate "missing 1", "missing 2", "missing 3", ... tokens is probably
not as good an indicator, since each distinct count would presumably be
scored as its own, sparsely trained clue.
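
To make the threshold idea concrete, here is a minimal sketch. It is not
Spambayes code: the function name, the `known_tokens` set standing in for the
training database, and the default threshold of 5 are all assumptions made
for illustration.

```python
def missing_words_token(tokens, known_tokens, threshold=5):
    """Return one synthetic token if enough message tokens are unknown.

    tokens       -- iterable of tokens extracted from the message
    known_tokens -- set standing in for the training database (assumption)
    threshold    -- hypothetical option: how many unknown tokens are tolerated
    """
    # Count distinct message tokens that the database has never seen.
    missing = sum(1 for tok in set(tokens) if tok not in known_tokens)
    # Emit a single clue only when the count exceeds the threshold,
    # rather than a different clue for every possible count.
    if missing > threshold:
        return ["more-than-%d-words-missing" % threshold]
    return []
```

With this shape, every message over the limit produces the same clue, so the
classifier accumulates statistics for one token instead of many.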
That said, at initial training time, and until a database grows to some
reasonable size (see separate thread "how low can you go") this token will
always show up. Therefore, it'd only be a good indicator after a certain
point... Does that limit its usefulness?
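
One possible way to handle the small-database case, sketched here only as an
illustration: suppress the token until some minimum number of messages has
been trained. The function name, the `messages_trained` count, and the
`min_trained` value of 200 are hypothetical and not taken from Spambayes.

```python
def gated_missing_token(tokens, known_tokens, messages_trained,
                        threshold=5, min_trained=200):
    """Like a plain threshold token, but silent while the database is small.

    messages_trained -- how many messages have been trained so far (assumption)
    min_trained      -- hypothetical cutoff below which no token is emitted
    """
    # With few trained messages nearly every token is unknown, so the clue
    # would fire on every message and carry no information.
    if messages_trained < min_trained:
        return []
    # Count distinct message tokens absent from the stand-in database.
    missing = sum(1 for tok in set(tokens) if tok not in known_tokens)
    if missing > threshold:
        return ["more-than-%d-words-missing" % threshold]
    return []
```

Where the cutoff should sit is an open question; it would have to be chosen
by testing rather than guessed.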
--
Vous exprimer; Exprésese; Te stesso esprimere; Express yourself!
Tim Stone
See my photography at www.fourstonesExpressions.com
See my writing at www.xanga.com/obj3kshun