Graham's spam filter
heikowu at ceosg.de
Thu Aug 22 21:00:24 CEST 2002
On Thu, 2002-08-22 at 19:48, Erik Max Francis wrote:
> But this doesn't sound so appealing when one of the main features of
> Graham's method is that it can be specialized by each user over time.
> You'd start with some basic representation of typical good and bad
> emails, but over time the filters could become better. A client/server
> solution suggests a single, monolithic database, which doesn't extend
> well to this idea.
Well... I explicitly stated that it doesn't scale well for larger groups
of people, but here where I live, we get our mail from the university
accounts, and get pretty much the same spam (as the mail addresses are
all of the form 4 letters, 4 digits, they are pretty well known out
there).
The idea behind a central database for SPAM/non-SPAM is that users don't
have to spend much time at the beginning, training their system, as
other users have already taken care of most of the training.
This doesn't mean that the user can't install this program on his very
own personal computer, and (maybe) consult the global database only for
tokens that are missing from his own database.
This would mean splitting the training process into two separate
instances: a global database and a personal database.
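A rough sketch of how such a two-level lookup might look, just to make
the idea concrete. Everything here is hypothetical: the class name
TwoTierTokenStore, the global_lookup callable (standing in for whatever
the client/server query would be), and the example tokens and
probabilities are made up for illustration. The 0.4 fallback for tokens
that neither database knows is borrowed from Graham's article and is an
assumption about how this program would treat them.

```python
# Hypothetical sketch of a personal database backed by a global one.
# Names and values are illustrative assumptions, not the actual program.

UNKNOWN_TOKEN_PROB = 0.4  # assumed default for tokens nobody has seen


class TwoTierTokenStore:
    def __init__(self, personal, global_lookup):
        # personal: dict mapping token -> spam probability (user-trained)
        # global_lookup: callable(token) -> probability, or None if unknown
        self.personal = personal
        self.global_lookup = global_lookup

    def probability(self, token):
        # The user's own training always wins over the shared data.
        if token in self.personal:
            return self.personal[token]
        # Only tokens missing locally are looked up in the global database.
        prob = self.global_lookup(token)
        if prob is None:
            return UNKNOWN_TOKEN_PROB
        return prob

    def train(self, token, prob):
        # Personal training never touches the global database.
        self.personal[token] = prob


# A plain dict stands in for the central server here.
shared = {"viagra": 0.99, "python": 0.05}
store = TwoTierTokenStore({"python": 0.01}, shared.get)
```

With this layout a new user starts with an empty personal dict and still
gets sensible answers from the shared data, while anything he trains
himself overrides the global value for that token.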
Hmm... Maybe sometime I might actually extend it to do that... *grin*
Universität 18 - Zimmer 2206 - Saarbrücken