[Spambayes] four questions about 'selfless' spam

Tim Peters tim.peters at gmail.com
Mon Jan 10 19:49:21 CET 2005


[Danny Friedmann]
> As a freelance journalist in the Netherlands, I am writing an article
> about 'selfless' spam and ways to fight it. I want to ask you what
> Spambayes does against this phenomenon.

Nothing special.  SpamBayes is trained by an individual, according to
an individual's beliefs about the difference between "ham" and "spam".
There is no collection of random words that has a good chance of
fooling many SpamBayes installations, and it's particularly difficult
to guess an individual's "ham" words.

> 1. Are there other names selfless spam (spam that wants to confuse the
> filters by sending plain nonsense words) is known under?

I don't really know what you mean by "selfless spam", or by "nonsense
words".  If they're legitimate words, but more or less chosen at
random, then that's often called "word salad".  If you mean gibberish
strings of letters that aren't really words at all, then SpamBayes
scores them at 0.5 (exactly halfway between ham and spam), as it does
for any word it hasn't seen before.  They don't affect the overall
scoring then.
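
To illustrate why a never-seen token is harmless, here is a minimal
sketch of a scorer that gives unknown tokens 0.5 and drops near-neutral
clues before combining the rest.  It is not SpamBayes's actual code:
the training table, the 0.1 cutoff, the function names, and the
simplified chi-squared combining are all assumptions made for the
example.

```python
import math

UNKNOWN_WORD_PROB = 0.5   # score given to any token never seen in training
MIN_PROB_STRENGTH = 0.1   # assumed cutoff: clues this close to 0.5 are ignored

# Hypothetical per-user training result: token -> estimated spam probability.
TRAINED = {"viagra": 0.99, "mortgage": 0.95, "python": 0.02, "meeting": 0.05}


def token_prob(token):
    """Return the trained spam probability, or 0.5 for an unseen token."""
    return TRAINED.get(token, UNKNOWN_WORD_PROB)


def chi2q(x2, dof):
    """Probability that a chi-squared variable with even `dof` exceeds x2."""
    m = x2 / 2.0
    total = term = math.exp(-m)
    for i in range(1, dof // 2):
        term *= m / i
        total += term
    return min(total, 1.0)


def score(tokens):
    """Combine the strong clues into one score between 0 (ham) and 1 (spam)."""
    probs = [token_prob(t) for t in sorted(set(tokens))]
    # Unknown tokens sit exactly at 0.5, so this filter removes them.
    clues = [p for p in probs if abs(p - 0.5) >= MIN_PROB_STRENGTH]
    if not clues:
        return 0.5
    n = len(clues)
    spamminess = 1.0 - chi2q(-2.0 * sum(math.log(1.0 - p) for p in clues), 2 * n)
    hamminess = 1.0 - chi2q(-2.0 * sum(math.log(p) for p in clues), 2 * n)
    return (spamminess - hamminess + 1.0) / 2.0


spammy = ["viagra", "mortgage"]
gibberish = ["xqzvv", "plomfrit", "zzkkaw"]
print(score(spammy), score(spammy + gibberish))
```

In this sketch, padding a message with gibberish leaves its score
unchanged, and a message made only of gibberish scores exactly 0.5.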

> 2. Is it a big problem in the English speaking world?

Neither word salad nor gibberish (pseudo-)words are effective against
SpamBayes in general.

> 3. Is Spambayes moving into a semantic analysis direction?

No.

> 4. Do you know of interesting articles or white papers about it online?

Sorry, not offhand -- try Google.

