Graham's spam filter

Oren Tirosh oren-py-l at hishome.net
Thu Aug 22 18:38:55 CEST 2002


On Thu, Aug 22, 2002 at 04:50:01PM +0200, Heiko Wundram wrote:
> Well, I'll see what comes out of my efforts. Maybe it'll actually prove
> to be useful.

I sure hope so!

I was wondering about another issue - could this system use decision
feedback? If the system rates an email as having a very low probability of
being spam (e.g. <0.1), that email could be fed back into the system to update
its statistics continuously, without human intervention. I assume that spam
that does pass through will not pass with such low probabilities. More
likely it will score somewhere over 0.5 but not reach the 0.9 threshold needed
to label it as spam.

Decision feedback is powerful but also dangerous - if the system starts
to make systematic errors, feeding its own decisions back in will tend to
reinforce them. This means that decision feedback may only be used for
nonspam, never for spam, because the most critical failure mode of the
system is false positives.
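
To make the idea concrete, here is a rough Python sketch of one-sided
decision feedback. It is only an illustration under assumptions: the class
name FeedbackFilter, the score callable (tokens -> spam probability) and the
counters are all hypothetical, and the scoring itself is left to whatever
filter is actually in use. Only the 0.1 and 0.9 thresholds come from the
discussion above.

```python
from collections import Counter

NONSPAM_FEEDBACK_THRESHOLD = 0.1  # "very low probability" example from above
SPAM_THRESHOLD = 0.9              # threshold needed to label a message as spam


class FeedbackFilter:
    """Hypothetical wrapper that feeds confident nonspam back into the stats."""

    def __init__(self, score):
        # score is an assumed callable: list of tokens -> spam probability.
        self.score = score
        self.good_counts = Counter()  # token counts for the nonspam corpus
        self.good_messages = 0        # number of nonspam messages learned

    def process(self, tokens):
        p = self.score(tokens)
        if p < NONSPAM_FEEDBACK_THRESHOLD:
            # Confident nonspam: update the statistics automatically.
            self.good_counts.update(tokens)
            self.good_messages += 1
            return "nonspam (fed back)"
        if p > SPAM_THRESHOLD:
            # Labeled spam, but deliberately NOT fed back: a false positive
            # here would otherwise reinforce itself.
            return "spam"
        # The uncertain middle band is delivered but never learned from.
        return "nonspam"
```

Anything between the two thresholds, and anything labeled spam, would still
need a human decision before it is allowed to change the statistics.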

	Oren



