Graham's spam filter

Paul Rubin phr-n2002b at
Fri Aug 23 00:54:04 CEST 2002

Heiko Wundram <heikowu at> writes:
> That's what I propose... Keeping a central database for typical spam
> words (a public database containing the SPAM-Corpus), and a private
> database containing the non-spam word occurrences (non-spam corpus). The
> word-probability database is kept separate on each computer...
> Guess this would help.

The private database has to be separate for every user and protected
at least as well as the contents of the user's mailbox.  Otherwise the
spam filter becomes another Echelon or Carnivore, scanning private
user email for keywords and revealing them to third parties.
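
A rough sketch of that split, in Python, might look like the following.
Everything named here (save_private_counts, load_private_counts,
word_spam_probability, the JSON file format) is made up for illustration,
and the probability estimate is a simplified one in the spirit of Graham's
per-word scoring, not his exact formula. The point is only that the shared
spam counts can be public while the non-spam counts live in a per-user
file readable by its owner alone.

```python
import json
import os
from collections import Counter


def save_private_counts(path, counts):
    """Write one user's non-spam word counts to an owner-only file."""
    # Mode 0o600 (owner read/write) is applied when the file is created.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as f:
        json.dump(dict(counts), f)
    # If the file already existed, tighten its permissions as well.
    os.chmod(path, 0o600)


def load_private_counts(path):
    """Read the per-user non-spam word counts back into a Counter."""
    with open(path) as f:
        return Counter(json.load(f))


def word_spam_probability(word, shared_spam, n_spam, private_ham, n_ham):
    """Estimate how spammy a word is from shared spam and private ham counts.

    shared_spam: word counts from the public spam corpus (n_spam messages)
    private_ham: word counts from this user's own mail (n_ham messages)
    """
    b = shared_spam.get(word, 0) / max(n_spam, 1)
    g = private_ham.get(word, 0) / max(n_ham, 1)
    if b + g == 0:
        return 0.5  # assumed neutral value for a word never seen before
    # Clamp so that no single word is treated as absolute proof.
    return min(0.99, max(0.01, b / (b + g)))
```

Only the spam counts ever leave the machine; the scoring runs locally
against the private file, so nothing about the user's legitimate mail is
revealed to whoever hosts the shared database.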

