Graham's spam filter

Anthony no-spam at fuckyou.co.uk
Wed Oct 9 00:37:48 CEST 2002


On Thu, 05 Sep 2002 00:01:42 +0000, Erik Max Francis wrote:

> Aaron Swartz wrote:
> 
>> I've been using bogofilter[1], Eric Raymond's Graham-derived spam filter,
>> which threw away base64-encoded data, and 90% of all spam that got past the
>> filter was base64-encoded. Therefore, I think that base64 content really
>> needs to be decoded. I wrote a base64-decoding filter in Python for it and
>> the problem has gone away.
> 
> Indeed.  I've been finding very much the same thing with my rule-based filter;
> about 90% of the spam that's getting through is base64 encoded. I haven't yet
> taken the next step of automatically decoding the base64 text parts (and then
> just processing that), but as you have discovered it is an obvious solution to
> the obvious problem.

It may be true that more and more spam is starting to use base64 encoding,
but very few legitimate emails encode their entire content; only spam does.
Is it really necessary to decode it?
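
For reference, here is a minimal sketch of the two options being discussed,
using only the standard library's email package. It is not Aaron's filter or
bogofilter's code; the function name extract_text is made up for illustration.
It decodes each text part (undoing base64 or quoted-printable) and also
reports each part's Content-Transfer-Encoding, so a filter could use the
decoded words, the encoding itself, or both.

```python
import email


def extract_text(raw_bytes):
    """Return (text, encodings) for a raw RFC 2822 message.

    text      -- the decoded text parts joined by newlines
    encodings -- the Content-Transfer-Encoding of each text part, lowercased
    (Hypothetical helper for illustration only.)
    """
    msg = email.message_from_bytes(raw_bytes)
    texts, encodings = [], []
    for part in msg.walk():
        # Skip multipart containers and non-text attachments.
        if part.get_content_maintype() != "text":
            continue
        cte = (part.get("Content-Transfer-Encoding") or "7bit").strip().lower()
        encodings.append(cte)
        # decode=True undoes base64 / quoted-printable and returns bytes.
        payload = part.get_payload(decode=True)
        if payload is None:
            continue
        charset = part.get_content_charset() or "ascii"
        try:
            texts.append(payload.decode(charset, errors="replace"))
        except LookupError:
            # Unknown charset name in the header; fall back to ASCII.
            texts.append(payload.decode("ascii", errors="replace"))
    return "\n".join(texts), encodings
```

A filter could feed the returned text to its tokenizer and, separately, treat
an entry such as "base64" in the encodings list as a token of its own.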



