[Spambayes] SpamBayes for 500.000 users
Skip Montanaro
skip at pobox.com
Tue Dec 16 17:00:26 EST 2003
Chris> But I can give you some first-hand knowledge from a much smaller
Chris> user base. I'm setting the same thing up for an office of 5
Chris> people, and here's the bare-bones fact; I need a separate
Chris> database for each user. I've tried using one database for
Chris> everyone, and it does work. But it only catches about 30-40
Chris> percent of spam. Not sure why this is the case, but it is
Chris> (unbalanced training?).
Does your shared database draw fairly equally on mail sent to all five
people? If not, you may find that some of the clues in the header will
"poison" your database. Tim discovered this effect in spades during early
testing. I believe one of the larger spam databases he used initially
consisted entirely of mail sent to one person. The recipient-oriented clues
related to that user poisoned his tests.
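
To make the poisoning effect concrete, here is a toy sketch in Python. It is
not the SpamBayes tokenizer or classifier: the token names ("to:alice" and so
on), the sample messages, and the helper token_spam_prob are all made up for
illustration, and the per-token estimate is a simple unsmoothed ratio. It
assumes all the spam was collected from one mailbox while the ham came from
all five users.

```python
def token_spam_prob(token, spam_msgs, ham_msgs):
    """Unsmoothed per-token estimate: spam rate / (spam rate + ham rate)."""
    spam_ratio = sum(token in m for m in spam_msgs) / len(spam_msgs)
    ham_ratio = sum(token in m for m in ham_msgs) / len(ham_msgs)
    if spam_ratio + ham_ratio == 0:
        return 0.5  # token never seen in training: no evidence either way
    return spam_ratio / (spam_ratio + ham_ratio)

# Hypothetical training data; each message is a set of tokens.
# Every spam message was addressed to alice.
spam = [
    {"to:alice", "viagra"},
    {"to:alice", "mortgage"},
    {"to:alice", "viagra", "free"},
]
# Ham is drawn from all five users.
ham = [
    {"to:alice", "meeting"},
    {"to:bob", "meeting"},
    {"to:carol", "lunch"},
    {"to:dave", "report"},
    {"to:erin", "lunch"},
]

p_alice = token_spam_prob("to:alice", spam, ham)  # recipient clue leans spam
p_bob = token_spam_prob("to:bob", spam, ham)      # recipient clue leans ham

print(f"to:alice -> {p_alice:.2f}")
print(f"to:bob   -> {p_bob:.2f}")
```

In this sketch the recipient token alone pushes alice's mail toward spam and
bob's mail toward ham, whatever the body says. Spam sent to bob then starts
with a strong ham clue, which may be part of why a shared database trained
unevenly misses so much.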
Skip