[Spambayes] Request for a trained spam filter
skip at pobox.com
Mon Nov 20 16:58:40 CET 2006
Sudheendra> I'm a Masters student at the Dept of CS at UT, Austin. I'm
Sudheendra> doing a project related to spam generation and I need a well
Sudheendra> trained spam filter for the same. I downloaded the training
Sudheendra> set available online and trained the spambayes filter using
Sudheendra> it. But the accuracy that I got on a set of new spam
Sudheendra> messages was not too great. Is there some way of improving
Sudheendra> the accuracy?
The qualitative nature of spam changes frequently. If you trained SpamBayes
on a data set that was generated more than a couple of months ago, that data
set isn't going to reflect many of the characteristics found in spam today.
What ham did you use?
Sudheendra> Since my project involves substantial experimentation I do
Sudheendra> not want to spend too much time in training the filter. So
Sudheendra> is it possible to get a trained spam filter from you?
Not really. A trained filter is created using a mix of good and bad email.
For obvious privacy reasons very few people are willing to expose
information about the email they receive.
Start training from scratch and train on the mistakes SpamBayes makes on the
mail you receive. After a handful of emails it should do a pretty good job
of properly filtering your email. Also, use the latest version (it is best to
check out what's in the CVS repository) and use this page on the SpamBayes wiki
http://www.entrian.com/sbwiki/TryOutThePreRelease
to guide your installation and setup.
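
To make the "train on mistakes" idea concrete, here is a rough sketch of the
loop in Python. It does not use SpamBayes code: ToyBayes is a hypothetical,
much simplified naive Bayes stand-in, and the names train_on_mistakes and
cutoff are assumptions made for illustration only. The point is just the
control flow: score each message first, and train only when the guess was
wrong.

```python
import math
import re
from collections import Counter


class ToyBayes:
    """Tiny naive Bayes word classifier (a toy stand-in, NOT SpamBayes)."""

    def __init__(self):
        # Number of messages containing each word, per class (True = spam).
        self.counts = {True: Counter(), False: Counter()}
        # Number of messages learned per class.
        self.nmsgs = {True: 0, False: 0}

    @staticmethod
    def tokenize(text):
        # Crude tokenizer: the set of lowercase word-like runs.
        return set(re.findall(r"[a-z0-9$']+", text.lower()))

    def learn(self, text, is_spam):
        self.nmsgs[is_spam] += 1
        self.counts[is_spam].update(self.tokenize(text))

    def spamprob(self, text):
        """Return a rough estimate of P(spam | text) in [0, 1]."""
        log_odds = 0.0
        for word in self.tokenize(text):
            # Laplace-smoothed fraction of messages in each class with the word.
            p_spam = (self.counts[True][word] + 1) / (self.nmsgs[True] + 2)
            p_ham = (self.counts[False][word] + 1) / (self.nmsgs[False] + 2)
            log_odds += math.log(p_spam / p_ham)
        # Clamp so math.exp cannot overflow on very long messages.
        log_odds = max(-50.0, min(50.0, log_odds))
        return 1.0 / (1.0 + math.exp(-log_odds))


def train_on_mistakes(clf, stream, cutoff=0.5):
    """Score each (text, is_spam) pair; learn only from wrong guesses."""
    trained = 0
    for text, is_spam in stream:
        guess = clf.spamprob(text) >= cutoff
        if guess != is_spam:
            clf.learn(text, is_spam)
            trained += 1
    return trained


if __name__ == "__main__":
    mail = [
        ("cheap pills buy now", True),
        ("meeting agenda for monday", False),
        ("buy cheap watches now", True),
        ("lunch on monday?", False),
    ]
    clf = ToyBayes()
    print("messages trained on:", train_on_mistakes(clf, mail))
```

Because only the errors are trained, the database stays small and tracks
whatever your current mail looks like, which is why an old public corpus is a
poor substitute.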
Skip