Graham's spam filter

Paul Rubin phr-n2002b at NOSPAMnightsong.com
Thu Aug 22 18:54:04 EDT 2002


Heiko Wundram <heikowu at ceosg.de> writes:
> That's what I propose... Keeping a central database for typical spam
> words (a public database containing the SPAM-Corpus), and a private
> database containing the non-spam word occurrences (non-spam corpus). The
> word probability database is kept separate on each computer...
> 
> Guess this would help.

The private database has to be separate for every user and protected
at least as well as the contents of the user's mailbox.  Otherwise the
spam filter becomes another Echelon or Carnivore, scanning private
user email for keywords and revealing them to third parties.
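
To make that concrete, here is a rough Python sketch of keeping the
non-spam counts in a per-user file that only its owner can read, while
the spam counts could come from a shared source. Everything in it is
an assumption made for illustration: the file name, the JSON format,
the function names, and the sample counts are all hypothetical, and the
probability function only loosely follows the per-word formula in
Graham's article (it leaves out his minimum-occurrence threshold). The
0600 mode is meaningful on Unix-like systems only.

```python
import json
import os
import stat
import tempfile
from collections import Counter


def save_private_counts(counts, path):
    """Store one user's non-spam word counts, readable only by the owner."""
    # Request mode 0600 at creation so the file is never world-readable.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as f:
        json.dump(dict(counts), f)
    # Tighten the mode in case the file already existed with looser bits.
    os.chmod(path, 0o600)


def load_private_counts(path):
    """Load the per-user counts written by save_private_counts."""
    with open(path) as f:
        return Counter(json.load(f))


def word_probability(word, spam_counts, nspam, ham_counts, nham):
    """Estimate the spam probability of one word from both corpora.

    spam_counts may be shared; ham_counts stays private to the user.
    Non-spam counts are doubled to bias against false positives.
    """
    b = min(1.0, spam_counts.get(word, 0) / nspam)
    g = min(1.0, 2 * ham_counts.get(word, 0) / nham)
    if b + g == 0:
        return 0.4  # word seen in neither corpus
    return max(0.01, min(0.99, b / (b + g)))


# Hypothetical sample data: shared spam counts, private non-spam counts.
shared_spam = Counter({"viagra": 10, "offer": 4})
private_ham = Counter({"meeting": 5, "offer": 1})

workdir = tempfile.mkdtemp()
ham_path = os.path.join(workdir, "ham_counts.json")
save_private_counts(private_ham, ham_path)
reloaded = load_private_counts(ham_path)
file_mode = stat.S_IMODE(os.stat(ham_path).st_mode)

p_spammy = word_probability("viagra", shared_spam, 10, reloaded, 10)
p_hammy = word_probability("meeting", shared_spam, 10, reloaded, 10)
p_unknown = word_probability("python", shared_spam, 10, reloaded, 10)
```

Only the private file needs mailbox-grade protection here. The shared
spam counts reveal nothing about what any particular user receives.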


