[Spambayes] Spammer countermeasures against bayesian filters
Tim Peters
tim.one at comcast.net
Tue Jul 29 22:14:10 EDT 2003
[Sean True]
> ...
> Using statistical summaries of message properties has been
> intriguing, but every time I try it it seems to be
>
> 1) marginal and
> 2) not robust for some reason like this
WRT #1, a statistical summary token is just one token, so pretty much can't
have a strong effect. The intent of giving everything the same "weight" is
to avoid creating an especially good thing to attack. Lots of statistical
summary tokens could have a strong effect in concert -- although not
necessarily a good effect.
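To make the one-token-versus-many point concrete, here is a minimal sketch. It assumes a chi-squared combining scheme in the style Spambayes is known for; the thread itself doesn't say which combining scheme is in play. The names chi2q and combined_score are hypothetical, and the per-token probabilities (0.2 for the hammy clues, 0.99 for a summary token) are made up for illustration, not measured from any corpus.

```python
import math

def chi2q(x2, df):
    """Upper-tail probability of a chi-squared variate; df must be even."""
    assert df % 2 == 0
    m = x2 / 2.0
    total = term = math.exp(-m)
    for i in range(1, df // 2):
        term *= m / i
        total += term
    # Guard against rounding pushing the sum slightly above 1.
    return min(total, 1.0)

def combined_score(probs):
    """Combine per-token spam probabilities into one score in [0, 1].

    Every token contributes exactly one term to each sum, so no single
    token carries more weight than any other.
    """
    if not probs:
        return 0.5
    n = len(probs)
    ln_s = sum(math.log(1.0 - p) for p in probs)
    ln_h = sum(math.log(p) for p in probs)
    s = 1.0 - chi2q(-2.0 * ln_s, 2 * n)  # evidence for spam
    h = 1.0 - chi2q(-2.0 * ln_h, 2 * n)  # evidence for ham
    return (s - h + 1.0) / 2.0

# Assumed, illustrative numbers only.
ham_clues = [0.2] * 20   # twenty mildly hammy ordinary tokens
SUMMARY = 0.99           # one very spammy statistical-summary token

base = combined_score(ham_clues)
with_one = combined_score(ham_clues + [SUMMARY])
with_ten = combined_score(ham_clues + [SUMMARY] * 10)

print("no summary tokens: ", base)
print("one summary token: ", with_one)
print("ten summary tokens:", with_ten)
```

With these assumed inputs, a single summary token nudges the score up but leaves it firmly on the ham side, while ten of them acting together drag it past 0.5. Whether that last shift would be a good effect on real mail is exactly the open question.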
> I'm suspecting that the purely lexical tokenization tricks may be more
> interesting. Or figuring out a way to recognize 'Nigerian Spam' which
> is the _only_ spam I still see on anything like a regular basis.
Hmm. You need to advertise your email address more <0.9 wink>. Are
Nigerian S[cp]ams getting misclassified for you? That would be interesting.
I haven't trained on one of those in months, cuz they're all nailed for me.