[spambayes-dev] Experience with Word Pairs

Tyson Whitehead twhitehe at uwo.ca
Thu Sep 1 15:01:17 CEST 2005


A while back I noticed that the sentence structure in the random-text spam
(with picture advertising) is frequently bad.  I figured word pair statistics
might pick up on this, so I enabled them (I get a lot of this type of spam).  I
seem to recall reading somewhere in the documentation that you guys wanted
feedback from people trying out the classification of word pairs instead of
just words.
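
For anyone unfamiliar with the idea, here is a minimal sketch of what
tokenizing on word pairs might look like.  It is not the SpamBayes
implementation: the function name word_pair_tokens and the "bi:" prefix are
made up for illustration, and real tokenizing does far more than split on
whitespace.

```python
def word_pair_tokens(text):
    """Yield single-word tokens, then adjacent word-pair tokens.

    Illustrative only: the name and the "bi:" prefix are assumptions,
    not taken from the SpamBayes source.
    """
    words = text.lower().split()
    # Plain single-word tokens, as a words-only classifier would see them.
    for word in words:
        yield word
    # One extra token per pair of neighbouring words, so that unusual
    # word sequences can be scored separately from the words themselves.
    for first, second in zip(words, words[1:]):
        yield "bi:%s %s" % (first, second)


if __name__ == "__main__":
    for token in word_pair_tokens("Buy cheap pills"):
        print(token)
```

The thought is that randomly generated text produces pairs that rarely
occur in real mail, even when each word on its own looks harmless.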

I am now getting about half as much spam as I did without the word pairs.
That is, about 20-30 spam messages a week are not classified as spam
(classified as either ham or unknown), compared to about 50-60 before.
Altogether, SpamBayes cleans out about 300 spam messages a week from my box.

Nice piece of software, guys!  My email had pretty much become unusable before
I installed it.

Thanks again!  -T

PS:  On the subject of spam, I believe it would be a good idea to create a
system that automatically replies to (and visits any links in) detected (and
flagged) spam.

This would greatly decrease the economic feasibility of spam.  Valid responses
would be buried in piles of return spam.  Websites would be immediately
DoSed.

The tricky bit would be making sure the system could not be manipulated to 
take out legitimate sites.

-- 
 Tyson Whitehead  (-twhitehe at uwo.ca -- WSC-)
 Computer Engineer                          Dept. of Applied Mathematics,
 Graduate Student- Applied Mathematics      University of Western Ontario,
 GnuPG Key ID# 0x8A2AB5D8                   London, Ontario, Canada