Graham's spam filter
phr-n2002b at NOSPAMnightsong.com
Fri Aug 23 00:54:04 CEST 2002
Heiko Wundram <heikowu at ceosg.de> writes:
> That's what I propose... Keeping a central database for typical spam
> words (a public database containing the SPAM-Corpus), and a private
> database containing the non-spam word occurrences (non-spam corpus). The
> words probability database is kept separate on each computer...
> Guess this would help.
The private database has to be separate for every user and protected
at least as well as the contents of the user's mailbox. Otherwise the
spam filter becomes another Echelon or Carnivore, scanning private
user email for keywords and revealing them to third parties.
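
A rough sketch of that split, in case it helps the discussion. It assumes the shared spam counts arrive as a plain dict, keeps the non-spam counts in a per-user file readable only by its owner (POSIX permissions), and computes the word probability locally. The function names, the JSON file format, and the parameters are all made up for illustration; the probability formula loosely follows the one in Graham's "A Plan for Spam" essay and is not meant as a definitive implementation.

```python
import json
import os
import stat


def save_private_counts(path, counts):
    """Write one user's non-spam word counts to a file only the owner can read.

    Hypothetical helper: the JSON format and the path are assumptions.
    """
    # Create the file with mode 0600 so other local users cannot read it.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as f:
        json.dump(counts, f)
    # Tighten permissions again in case the file already existed with a looser mode.
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)


def load_private_counts(path):
    """Read the private non-spam word counts back from disk."""
    with open(path) as f:
        return json.load(f)


def spam_probability(word, spam_counts, n_spam, ham_counts, n_ham):
    """Combine public spam counts with private non-spam counts for one word.

    n_spam and n_ham are the number of messages in each corpus (assumed > 0).
    Returns None when the word has been seen too rarely to judge.
    """
    b = spam_counts.get(word, 0)
    g = 2 * ham_counts.get(word, 0)  # non-spam counts are doubled to bias against false positives
    if b + g < 5:
        return None
    bad = min(1.0, b / n_spam)
    good = min(1.0, g / n_ham)
    # Clamp so that no single word is ever treated as certain either way.
    return max(0.01, min(0.99, bad / (good + bad)))
```

Only the probability computation touches both databases, and it runs on the user's own machine, so the private counts never have to leave it.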