[spambayes-dev] Suggested Feature?

Jeff Epler jepler at unpythonic.net
Fri Sep 17 04:04:57 CEST 2004

At first blush, this should be "easy" if you write the "central
repository" as a new database back-end for spambayes, with some kind of
delayed update system.
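
A rough sketch of what "delayed update" could mean here, assuming a
back-end that buffers token-count changes locally and only pushes them
to the shared store on request. The class name `DelayedUpdateStore`, its
methods, and the plain dict standing in for the central repository are
all hypothetical; none of this is the actual spambayes storage API.

```python
from collections import Counter


class DelayedUpdateStore:
    """Hypothetical wrapper: buffers token-count changes locally and
    pushes them to a shared store only when flush() is called."""

    def __init__(self, central):
        self.central = central      # stands in for the central repository
        self.pending = Counter()    # local deltas not yet sent

    def train(self, tokens, is_spam):
        """Record one trained message; nothing is sent yet."""
        label = "spam" if is_spam else "ham"
        for tok in set(tokens):     # count each token once per message
            self.pending[(tok, label)] += 1

    def flush(self):
        """Apply all buffered deltas to the central store."""
        sent = len(self.pending)
        for key, delta in self.pending.items():
            self.central[key] = self.central.get(key, 0) + delta
        self.pending.clear()
        return sent


central = {}                        # assumption: a dict models the shared DB
store = DelayedUpdateStore(central)
store.train(["to:jepler", "wedding"], is_spam=True)
store.train(["wedding", "python"], is_spam=False)
before_flush = dict(central)        # central store is untouched so far
sent = store.flush()                # now the buffered counts are pushed
```

In a real setup flush() would presumably run on a timer or at shutdown,
and the central store would be a network service rather than a dict.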

However, I suspect this setup won't be great for many users.  A very
obvious example is that To: and From: headers are used by Spambayes as
token sources, and the spammy and hammy values are very individual. (In
fact, because I get much more list traffic than legitimate personal
e-mail, messages addressed to me are spammy, on average.)

Not everyone gets the same kinds of messages, either.  For instance,
many folks would get lots of legitimate messages about "wedding", while
that's probably a spam clue for me.
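
To make the point concrete, here is a small sketch with made-up counts
for the token "wedding" in two users' training data. The function
`spam_prob` is a simplified ratio estimate written for this example, not
the exact formula Spambayes uses, and every number below is
illustrative rather than measured.

```python
def spam_prob(spam_count, ham_count, n_spam, n_ham):
    """Simplified per-token spam estimate (an assumption for this
    example, not the Spambayes formula): compare how often the token
    shows up in spam versus ham."""
    spam_ratio = spam_count / n_spam
    ham_ratio = ham_count / n_ham
    return spam_ratio / (spam_ratio + ham_ratio)


# (spam msgs with token, ham msgs with token, total spam, total ham)
# Illustrative counts only.
user_a = (40, 1, 1000, 1000)    # almost never sees "wedding" in ham
user_b = (40, 60, 1000, 1000)   # gets lots of legitimate wedding mail
pooled = tuple(a + b for a, b in zip(user_a, user_b))

p_a = spam_prob(*user_a)
p_b = spam_prob(*user_b)
p_pooled = spam_prob(*pooled)

print(f"user A: {p_a:.2f}  user B: {p_b:.2f}  pooled: {p_pooled:.2f}")
```

With these counts the token is a strong spam clue for user A and a mild
ham clue for user B, while the pooled estimate lands in between and
serves neither of them as well as their own database does.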

Then there's the need to keep "the spammers" from submitting messages
in order to break the system, and the need to preserve users' privacy.
Lots of websites seem to have a facility to send forgotten passwords in
e-mail, which would make those passwords show up as tokens in the shared
database.
Do I really want to send token lists that come from all the love letters
I get in e-mail out to some third party?

Finally, why would I trust/want to use a service like this, when I built
a very good spambayes database with only a modest amount of training
effort?  Larger databases are not clearly better than smaller databases,
and you're basically proposing the largest possible database...
