Hi,

I'm a Masters student at the Dept of CS at UT, Austin. I'm doing a project related to spam generation and I need a well trained spam filter for the same. I downloaded the training set available online and trained the spambayes filter using it. But the accuracy that I got on a set of new spam messages was not too great. Is there some way of improving the accuracy?

Since my project involves substantial experimentation I do not want to spend too much time in training the filter. So is it possible to get a trained spam filter from you? I'd really appreciate any help from you in this regard.

Thanks,
Sudheendra
Sudheendra> I'm a Masters student at the Dept of CS at UT, Austin. I'm
Sudheendra> doing a project related to spam generation and I need a well
Sudheendra> trained spam filter for the same. I downloaded the training
Sudheendra> set available online and trained the spambayes filter using
Sudheendra> it. But the accuracy that I got on a set of new spam
Sudheendra> messages was not too great. Is there some way of improving
Sudheendra> the accuracy?

The qualitative nature of spam changes frequently. If you trained SpamBayes on a data set that was generated more than a couple months ago that dataset isn't going to reflect many of the characteristics found in spam today. What ham did you use?

Sudheendra> Since my project involves substantial experimentation I do
Sudheendra> not want to spend too much time in training the filter. So
Sudheendra> is it possible to get a trained spam filter from you?

Not really. A trained filter is created using a mix of good and bad email. For obvious privacy reasons very few people are willing to expose information about the email they receive. Start training from scratch and train on the mistakes SpamBayes makes on the mail you receive. After a handful of emails it should do a pretty good job properly filtering your email.

Also, use the latest version (best to check out what's in the CVS repository) and use this page on the SpamBayes wiki

http://www.entrian.com/sbwiki/TryOutThePreRelease

to guide your installation and setup.

Skip
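
For anyone who wants to see the "train on the mistakes" idea in concrete form, here is a rough sketch. It does not use the SpamBayes code or its API: ToyFilter is a made-up word-count classifier with add-one smoothing, and train_on_mistakes is a hypothetical helper name. The only point is the loop, where a message is added to the training data only when the filter currently gets it wrong.

```python
import math
from collections import Counter


class ToyFilter:
    """Tiny word-count naive Bayes filter (illustrative; NOT SpamBayes)."""

    def __init__(self):
        self.counts = {"ham": Counter(), "spam": Counter()}
        self.n_msgs = {"ham": 0, "spam": 0}

    def train(self, text, label):
        # Count each distinct word once per message.
        self.n_msgs[label] += 1
        self.counts[label].update(set(text.lower().split()))

    def spam_score(self, text):
        # Sum of per-word log-odds (spam vs. ham) with add-one smoothing.
        score = 0.0
        for word in set(text.lower().split()):
            p_spam = (self.counts["spam"][word] + 1) / (self.n_msgs["spam"] + 2)
            p_ham = (self.counts["ham"][word] + 1) / (self.n_msgs["ham"] + 2)
            score += math.log(p_spam / p_ham)
        return score

    def classify(self, text):
        # An untrained filter scores 0.0 and therefore answers "ham".
        return "spam" if self.spam_score(text) > 0 else "ham"


def train_on_mistakes(filt, labelled_stream):
    """Train only on messages the filter misclassifies; return the count."""
    mistakes = 0
    for text, label in labelled_stream:
        if filt.classify(text) != label:
            filt.train(text, label)
            mistakes += 1
    return mistakes
```

Fed your own incoming mail as (text, label) pairs, the loop trains on whatever the filter gets wrong, so the training data follows the mail you actually receive instead of an old corpus.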
participants (2)
- skip@pobox.com
- svnaras@imap.cs.utexas.edu