Graham's spam filter

Oren Tirosh oren-py-l at
Thu Aug 22 19:54:15 CEST 2002

On Thu, Aug 22, 2002 at 10:29:36AM -0600, Joseph A. Knapka wrote:
> The analyzer takes about two minutes to go through my corpus of
> about 2000 messages. The filter starts and loads the probability
> dictionary in under five seconds. Doesn't seem like a non-starter
> to me :-) 

For a lot of standard mail components, the easiest and most robust way 
to interface with them is to run an executable separately for each message. 
In that case a five-second startup time may be too much for sites 
with high load.
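
To put a rough number on that, here is a back-of-envelope sketch. It assumes
every message pays the full startup cost and ignores the time spent actually
scoring the message; the function name and the `workers` parameter are mine,
purely for illustration.

```python
# Rough throughput ceiling for the one-process-per-message model.
STARTUP_SECONDS = 5.0  # upper bound quoted above for loading the dictionary


def max_messages_per_hour(startup_seconds, workers=1):
    """Upper bound on throughput when each message pays the startup cost."""
    return workers * 3600.0 / startup_seconds


print(max_messages_per_hour(STARTUP_SECONDS))  # 720.0
```

A single worker tops out around 720 messages an hour under these assumptions,
which is fine for a personal mailbox but not for a busy site.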

> (Of course, the user should never have to deal with
> either program, except to configure it. The filter reads from
> a POP3 or IMAP mailbox and writes the spam-free messages
> either to a file or to another "sanitized" SMTP mailbox,
> which is the one the user checks.)

In this model the program is started once for multiple messages, so a 
somewhat slower startup is not an issue.
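
A minimal sketch of why the batch model hides the startup cost: the slow load
happens once per run, then every message reuses the table. Everything here is
hypothetical. `load_probabilities` stands in for the real dictionary load, and
`is_spam` is a toy threshold test, not Graham's actual combining formula.

```python
def load_probabilities():
    """Stand-in for the slow startup step; returns a toy probability table."""
    load_probabilities.calls += 1
    return {"viagra": 0.99, "python": 0.01}


load_probabilities.calls = 0  # counts how often the expensive load ran


def is_spam(message, probs, threshold=0.9):
    """Toy check: flag a message if any word has a high spam probability."""
    words = message.lower().split()
    return any(probs.get(word, 0.0) > threshold for word in words)


def filter_mailbox(messages):
    """Load the table once, then return only the messages not flagged."""
    probs = load_probabilities()  # startup cost paid once per run
    return [m for m in messages if not is_spam(m, probs)]


clean = filter_mailbox(["Buy viagra now", "Python list digest", "lunch?"])
```

With three messages in the batch, the load still runs only once, so the
per-message share of the startup time shrinks as the mailbox grows.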

