Graham's spam filter
rnd at onego.ru
Fri Aug 23 05:50:30 CEST 2002
On 22 Aug 2002, Paul Rubin wrote:
>Heiko Wundram <heikowu at ceosg.de> writes:
>> That's what I propose... Keeping a central database for typical spam
>> words (a public database containing the SPAM-Corpus), and a private
>> database containing the non-spam word occurrences (non-spam corpus). The
>> words probability database is kept separate on each computer...
>> Guess this would help.
>The private database has to be separate for every user and protected
>at least as well as the contents of the user's mailbox. Otherwise the
>spam filter becomes another Echelon or Carnivore, scanning private
>user email for keywords and revealing them to third parties.
Words could be hashed before being put into the private database.
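
A rough sketch of what that might look like, just to make the idea concrete. The function names, the per-user key and the sample text are made up for illustration; none of this comes from Graham's filter. It uses a keyed hash (HMAC-SHA256 from the standard library) as an assumption on my part, since a plain unkeyed hash of a common word could be guessed by hashing a dictionary.

```python
import hashlib
import hmac
from collections import Counter


def hash_word(word, key):
    """Return a hex digest that stands in for the word in the database."""
    # Lowercase first so "Budget" and "budget" map to the same entry.
    data = word.lower().encode("utf-8")
    return hmac.new(key, data, hashlib.sha256).hexdigest()


def count_hashed_words(text, key):
    """Count word occurrences, keyed by hash instead of by the word itself."""
    # Naive whitespace tokenizing; a real filter would tokenize more carefully.
    return Counter(hash_word(w, key) for w in text.split())


# Hypothetical per-user secret; it would have to be stored as safely
# as the mailbox itself.
key = b"per-user secret"

# Hypothetical non-spam message text.
db = count_hashed_words("Meeting about the budget the Budget", key)
```

The filter can still look up a word's count by hashing it the same way, e.g. db[hash_word("budget", key)], while the database itself contains only digests.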
Sincerely yours, Roman Suzi
rnd at onego.ru =\= My AI powered by Linux RedHat 7.2