[Spambayes] RE: chi-combining
Rob W.W. Hooft
Tue Nov 19 10:27:22 2002
Tim Peters wrote:
> In an offline thread with Greg Louis (who's working on bogofilter), I tried
> an experiment using just the S, then just the H, components of our spamprob
> calculation. We currently return (1+S-H)/2. The "justs" result here just
> returns S, the "justh" just returns 1-H. justs is a comparative disaster,
> but the more I stare at it, the more I think justh did surprisingly well:
Try your "invisible ham" spam with this. I'm sure it will score
rock-solid ham. By using "justh" you're basically telling spammers that
you're not sensitive to spam words, as long as there is enough of the
message that looks like ham!
The two cases where this makes a difference are:
H=1 S=1 : This is the case I just described. A message that looks
          like both ham and spam used to score as unsure, but will
          now score as ham.
H=0 S=0 : A message that doesn't look like anything seen before
          used to score as unsure, but will now score as spam.
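
To make the two cases concrete, here is a minimal Python sketch comparing the combined score with the "justh" score. The function names and the 0.2/0.9 cutoffs are assumptions chosen for illustration; this is not the actual Spambayes code or its configured defaults.

```python
def combined(S, H):
    """Current scheme: (1 + S - H) / 2."""
    return (1.0 + S - H) / 2.0


def just_h(S, H):
    """The "justh" experiment: ignore S and return 1 - H."""
    return 1.0 - H


def classify(score, ham_cutoff=0.2, spam_cutoff=0.9):
    """Map a score in [0, 1] to a label. Cutoffs are assumed values."""
    if score < ham_cutoff:
        return "ham"
    if score > spam_cutoff:
        return "spam"
    return "unsure"


# The two extreme cases where the schemes disagree.
cases = {
    "looks like both ham and spam (H=1, S=1)": (1.0, 1.0),
    "looks like nothing seen before (H=0, S=0)": (0.0, 0.0),
}

for name, (S, H) in cases.items():
    print(name)
    print("  combined:", combined(S, H), classify(combined(S, H)))
    print("  justh:   ", just_h(S, H), classify(just_h(S, H)))
```

In both cases the combined formula lands at 0.5, in the unsure range, while 1-H alone goes to one extreme or the other.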
I suspect that 1-H is easier for the hypothetical "smart spammer" to
counter than (1+S-H)/2. It is another form of cancellation disease.
Rob W.W. Hooft || firstname.lastname@example.org || http://www.hooft.net/people/rob/