[spambayes-dev] Experience with Word Pairs
Tyson Whitehead
twhitehe at uwo.ca
Thu Sep 1 15:01:17 CEST 2005
A while back I noticed that the sentence structure in the random-text spam
(with picture advertising) is frequently bad. I figured word-pair statistics
might pick up on this, so I enabled them (I get a lot of this type of spam). I
seem to recall reading somewhere in the documentation that you guys wanted
feedback from people trying out the classification of word pairs instead of
just words.
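For readers unfamiliar with the idea, here is a minimal sketch of what
word-pair tokens look like next to single-word tokens. It assumes plain
whitespace splitting and lowercasing; it is not SpamBayes's actual tokenizer,
and the function name word_pair_tokens is hypothetical.

```python
def word_pair_tokens(text):
    """Return single-word tokens followed by adjacent word-pair tokens.

    Illustrative only: real tokenizers handle headers, punctuation,
    and encodings far more carefully than a whitespace split.
    """
    words = text.lower().split()
    # Pair each word with its immediate successor.
    pairs = [f"{a} {b}" for a, b in zip(words, words[1:])]
    return words + pairs


if __name__ == "__main__":
    for token in word_pair_tokens("Cheap pills shipped overnight"):
        print(token)
```

Random-text spam tends to produce pairs that rarely or never occur in
legitimate mail, which is why pair statistics can catch what single words miss.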
I am now getting about half as much spam as I got without the word
pairs. That is, about 20-30 spam messages a week are not classified as
spam (they end up as either ham or unknown), compared to about 50-60 before.
Altogether, SpamBayes cleans out about 300 spam messages a week from my box.
Nice piece of software guys! My email had pretty much become unusable before
I installed it.
Thanks again! -T
PS: On the subject of spam, I believe it would be a good idea to create a
system that automatically replies to (and visits any links in) detected (and
flagged) spam.
This would greatly decrease the economic feasibility of spam. Valid responses
would be buried in piles of return spam. Websites would be immediately
DoSed.
The tricky bit would be making sure the system could not be manipulated to
take out legitimate sites.
--
Tyson Whitehead (-twhitehe at uwo.ca -- WSC-)
Computer Engineer                        Dept. of Applied Mathematics,
Graduate Student- Applied Mathematics    University of Western Ontario,
GnuPG Key ID# 0x8A2AB5D8                 London, Ontario, Canada