Graham's spam filter

Roman Suzi rnd at
Fri Aug 23 05:50:30 CEST 2002

On 22 Aug 2002, Paul Rubin wrote:

>Heiko Wundram <heikowu at> writes:
>> That's what I propose... Keeping a central database for typical spam
>> words (a public database containing the SPAM-Corpus), and a private
>> database containing the non-spam word occurrences (non-spam corpus). The
>> words probability database is kept separate on each computer...
>> Guess this would help.
>The private database has to be separate for every user and protected
>at least as well as the contents of the user's mailbox.  Otherwise the
>spam filter becomes another Echelon or Carnivore, scanning private
>user email for keywords and revealing them to third parties.

Words could be hashed before being put into the private database.
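
A minimal sketch of what I mean, under my own assumptions: the names
hash_word and count_hashed_words are made up for illustration, and the
per-user secret key is an addition of mine, not something proposed
earlier in this thread. The database then stores only digests and
counts, never the words themselves.

```python
import hashlib
import hmac
from collections import Counter


def hash_word(word, key):
    """Return a keyed SHA-256 digest of a lowercased word (hypothetical helper)."""
    return hmac.new(key, word.lower().encode("utf-8"), hashlib.sha256).hexdigest()


def count_hashed_words(text, key):
    """Count word occurrences, keyed by digest instead of by the word itself."""
    return Counter(hash_word(w, key) for w in text.split())


# 'key' is an assumed per-user secret; it must be kept as private as the mailbox.
key = b"per-user secret"
counts = count_hashed_words("Meeting moved to Friday friday", key)
print(counts[hash_word("friday", key)])
```

A plain unkeyed hash could probably be reversed by hashing a dictionary
of common words and comparing, which is why the sketch uses a keyed hash.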

Sincerely yours, Roman Suzi
rnd at =\= My AI powered by Linux RedHat 7.2

