Graham's spam filter
oren-py-l at hishome.net
Thu Aug 22 18:38:55 CEST 2002
On Thu, Aug 22, 2002 at 04:50:01PM +0200, Heiko Wundram wrote:
> Well, I'll see what comes out of my efforts. Maybe it'll actually prove
> to be useful.
I sure hope so!
I was wondering about another issue - could this system use decision
feedback? If the system detects an email as having a very low probability of
being spam (e.g. <0.1) it could be fed back into the system to update its
statistics continuously without human intervention. I assume that spam
that does pass through will not pass with such low probabilities. More
likely it will have something over 0.5 but not pass the 0.9 threshold needed
to label it as spam.
Decision feedback is powerful but also dangerous - if the system starts
to make systematic errors, they will tend to reinforce themselves. This means
that decision feedback may only be used for nonspam, never for spam, because
the most critical failure mode of the system is false positives.
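
To make the idea concrete, here is a rough sketch of what I have in mind. It is not Graham's actual formula: the scoring is a simplified naive Bayes stand-in with add-one smoothing, and the class name FeedbackFilter, its methods and the two threshold constants are all made up for illustration. The only part that matters is classify(), which retrains on a message automatically when its score falls below 0.1 and never retrains on anything it scores as spam.

```python
import math
from collections import Counter

# Thresholds taken from the discussion above; the names are hypothetical.
HAM_FEEDBACK_THRESHOLD = 0.1  # below this, feed the message back as nonspam
SPAM_THRESHOLD = 0.9          # above this, label as spam (never fed back)


class FeedbackFilter:
    """Toy token-count filter with one-sided decision feedback."""

    def __init__(self):
        self.spam_tokens = Counter()  # messages containing each token (spam)
        self.ham_tokens = Counter()   # messages containing each token (nonspam)
        self.n_spam = 0
        self.n_ham = 0

    def train(self, text, is_spam):
        # Count each distinct token once per message.
        tokens = set(text.lower().split())
        if is_spam:
            self.spam_tokens.update(tokens)
            self.n_spam += 1
        else:
            self.ham_tokens.update(tokens)
            self.n_ham += 1

    def spam_probability(self, text):
        # Assumption: simplified naive Bayes in log-odds form with add-one
        # smoothing and equal priors, standing in for the real scoring rule.
        log_odds = 0.0
        for tok in set(text.lower().split()):
            p_spam = (self.spam_tokens[tok] + 1) / (self.n_spam + 2)
            p_ham = (self.ham_tokens[tok] + 1) / (self.n_ham + 2)
            log_odds += math.log(p_spam / p_ham)
        # Clamp so math.exp cannot overflow on very long messages.
        log_odds = max(-50.0, min(50.0, log_odds))
        return 1.0 / (1.0 + math.exp(-log_odds))

    def classify(self, text):
        p = self.spam_probability(text)
        # Decision feedback for confident nonspam only. Messages scored as
        # spam are never fed back, so a false positive cannot reinforce itself.
        if p < HAM_FEEDBACK_THRESHOLD:
            self.train(text, is_spam=False)
        return p
```

Anything that scores between the two thresholds is left alone and would still need a human to label it before it affects the statistics.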