[Python-Dev] The first trustworthy <wink> GBayes results
Eric S. Raymond
esr@thyrsus.com
Thu, 29 Aug 2002 13:13:07 -0400
Tim Peters <tim.one@comcast.net>:
> Spammers often generate random "word-like" gibberish at the ends of msgs,
> and "rd" is one of the random two-letter combos that appears in the spam
> corpus. Perhaps it would be good to ignore "words" with fewer than W
> characters (to be determined by experiment).
Bogofilter throws out words of length one and two.
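
A minimal sketch of that kind of length cutoff, in Python for illustration. This is not bogofilter's actual lexer: the names `tokenize` and `MIN_WORD_LEN` are hypothetical, and splitting on whitespace is a simplifying assumption.

```python
# Hypothetical cutoff: drop tokens of length one and two, keep the rest.
MIN_WORD_LEN = 3


def tokenize(text, min_len=MIN_WORD_LEN):
    """Yield whitespace-separated tokens that are at least min_len long."""
    for word in text.split():
        if len(word) >= min_len:
            yield word
```

With this cutoff, random two-letter combos like "rd" never reach the scoring step. The right value of `min_len` is still something to settle by experiment.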
> I expect that including the headers would have given these much better
> chances of getting through, given Robin and Alex's posting histories.
> Still, the idea of counting words multiple times is open to question, and
> experiments both ways are in order.
And bogofilter includes the headers. This is important, since
otherwise you don't rate things like spamhaus addresses and sender
names.
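
A rough sketch of feeding header values through the same tokenizer as the body, using Python's standard `email` package. The function name `message_tokens`, the whitespace splitting, and the length cutoff are assumptions for illustration, not a description of how bogofilter itself is implemented.

```python
from email import message_from_string


def message_tokens(raw, min_len=3):
    """Return tokens from header values and body of a raw RFC 822 message.

    Assumption: tokens are whitespace-separated and short ones are dropped.
    Multipart bodies are skipped here to keep the sketch small.
    """
    msg = message_from_string(raw)
    tokens = []
    # Header values first, so sender addresses and names get rated too.
    for _name, value in msg.items():
        tokens.extend(w for w in value.split() if len(w) >= min_len)
    # Then the body, if it is a plain (non-multipart) payload.
    payload = msg.get_payload()
    if isinstance(payload, str):
        tokens.extend(w for w in payload.split() if len(w) >= min_len)
    return tokens
```

Given a message with a `From:` line, the sender address ends up in the token list alongside the body words, so the classifier can learn from it.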
--
<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>