Graham's spam filter

Heiko Wundram heikowu at ceosg.de
Fri Aug 23 13:40:18 EDT 2002


On Thu, 2002-08-22 at 23:15, Karl Vogel wrote:
>    If you can store each message as a separate file (perhaps by using a
>    Maildir setup), ifile can scan your entire mailbox in a hurry.  I use
>    a script to download my mail periodically, break it into separate
>    messages, and scan 200-500 messages at once.  It rarely takes more than
>    2-3 seconds on a Sun Ultra-10 workstation to accurately identify the
>    spam messages and move them elsewhere.

That might be true, but if you put something like spam classification in
front of a mailing list, it's hard to collect messages into batches... In
this case you'd rather have the program called from a sendmail delivery
agent (e.g. procmail) once per message, and starting it up and reloading
its databases each and every time is not really feasible on a busy server;
keeping the databases in memory in a long-running process is much more
efficient (that's the basic client/server model).
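
To make the client/server idea concrete, here is a minimal sketch, not a
finished design. The token table, the 0.9 threshold, the one-message-per-
connection protocol and all names (TOKEN_PROBS, ClassifyHandler,
classify_via_server) are assumptions made up for illustration; the scoring
is a simplified version of the combining rule from Graham's essay. The
point is only that the table is loaded once and stays in memory, while the
delivery agent runs a tiny client per message.

```python
import socket
import socketserver
import threading

# Hypothetical token -> spam probability table. A real filter would load
# this from disk once at startup, which is the expensive step.
TOKEN_PROBS = {"viagra": 0.99, "free": 0.90, "python": 0.02, "patch": 0.05}
UNKNOWN_PROB = 0.4  # value Graham's essay uses for never-seen tokens


def spam_probability(text, probs=TOKEN_PROBS, top_n=15):
    """Combine per-token probabilities (simplified Graham-style rule)."""
    tokens = set(text.lower().split())
    scores = [probs.get(tok, UNKNOWN_PROB) for tok in tokens]
    # Keep the tokens whose probability is farthest from the neutral 0.5.
    scores.sort(key=lambda p: abs(p - 0.5), reverse=True)
    spam = ham = 1.0
    for p in scores[:top_n]:
        spam *= p
        ham *= 1.0 - p
    return spam / (spam + ham)


class ClassifyHandler(socketserver.StreamRequestHandler):
    """Read one message until EOF, reply with 'SPAM' or 'HAM'."""

    def handle(self):
        text = self.rfile.read().decode("utf-8", "replace")
        verdict = "SPAM" if spam_probability(text) > 0.9 else "HAM"
        self.wfile.write(verdict.encode("ascii") + b"\n")


def classify_via_server(address, text):
    """What the delivery agent would do: connect, send, read the verdict."""
    with socket.create_connection(address) as conn:
        conn.sendall(text.encode("utf-8"))
        conn.shutdown(socket.SHUT_WR)  # tell the server the message is complete
        return conn.makefile().readline().strip()


if __name__ == "__main__":
    # Port 0 lets the OS pick a free port for this demonstration.
    server = socketserver.ThreadingTCPServer(("127.0.0.1", 0), ClassifyHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    print(classify_via_server(server.server_address, "free viagra"))
    server.shutdown()
    server.server_close()
```

The client side is cheap enough to start for every delivery, because it
never touches the databases itself.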

>    Instead of writing yet another POP server, why not use a pipeline
>    something like this?

The problem with this: people running Windows (for whom I'm designing this)
rarely run an MTA. The only choices you have on such a platform are
basically to plug something into the mail reader (not portable; we have
Netscape Mail/Outlook/Outlook Express/Eudora scattered everywhere here),
or, which was my idea, to give them a local POP3 server which dispatches
their mail into separate user accounts after running it through the
Bayesian filter, where a user account is basically a folder.

They can then download their emails happily using any tool they like.
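
Roughly, the dispatch step could look like the sketch below. It is only an
illustration under assumptions: looks_like_spam() is a hypothetical keyword
stand-in for the Bayesian filter, the "<user>-spam" account naming is made
up, and fetching the mail from the real server and speaking POP3 to the
mail reader are left out entirely.

```python
from collections import defaultdict
from email import message_from_string


def looks_like_spam(message):
    """Hypothetical stand-in for the Bayesian filter: a keyword check."""
    subject = (message["Subject"] or "").lower()
    return "viagra" in subject or "free money" in subject


def dispatch(raw_messages, user):
    """Sort raw messages into virtual accounts, e.g. 'heiko' and 'heiko-spam'.

    Each key of the returned dict is a login name the mail reader would use
    against the local POP3 server; the value is that folder's messages.
    """
    folders = defaultdict(list)
    for raw in raw_messages:
        message = message_from_string(raw)
        account = user + "-spam" if looks_like_spam(message) else user
        folders[account].append(raw)
    return dict(folders)


if __name__ == "__main__":
    mail = [
        "Subject: Free money now\n\nClick here",
        "Subject: Re: Graham's spam filter\n\nHi",
    ]
    for account, messages in dispatch(mail, "heiko").items():
        print(account, len(messages))
```

The mail reader then just needs two POP3 accounts configured, one per
folder, which works the same way in every client.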

>    The first and third parts are already done; qmail will deliver the
>    message as a single file, and the qmail POP-server will deliver it.

The problem is: on Unix I agree, but on Windows there is no such thing. And
I'm not planning on installing Cygwin on each and every one of the PCs I'll
outfit with the program once it's done, and giving them a local MTA...

Well, hope this makes my thoughts a little clearer...

Yours,

	Heiko Wundram
	Netzwart Wohnheim-D
	Universität 18 - Zimmer 2206 - Saarbrücken

