[Spambayes] Request for a trained spam filter

skip at pobox.com
Mon Nov 20 16:58:40 CET 2006


    Sudheendra> I'm a Masters student at the Dept of CS at UT, Austin. I'm
    Sudheendra> doing a project related to spam generation and I need a well
    Sudheendra> trained spam filter for the same.  I downloaded the training
    Sudheendra> set available online and trained the spambayes filter using
    Sudheendra> it. But the accuracy that I got on a set of new spam
    Sudheendra> messages was not too great.  Is there some way of improving
    Sudheendra> the accuracy?

The character of spam changes frequently.  If you trained SpamBayes on a
data set that was assembled more than a couple of months ago, that data set
isn't going to reflect many of the characteristics found in spam today.
What ham did you use?

    Sudheendra> Since my project involves substantial experimentation I do
    Sudheendra> not want to spend too much time in training the filter. So
    Sudheendra> is it possible to get a trained spam filter from you?

Not really.  A trained filter is created using a mix of good and bad email,
so its database reflects the mail of whoever trained it.  For obvious
privacy reasons, very few people are willing to expose information about
the email they receive.

Start training from scratch and train on the mistakes SpamBayes makes on the
mail you receive.  After a handful of emails it should do a pretty good job
of filtering your email.  Also, use the latest version (best to check out
what's in the CVS repository) and use this page on the SpamBayes wiki

    http://www.entrian.com/sbwiki/TryOutThePreRelease

to guide your installation and setup.
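
To make the train-on-mistakes idea above concrete, here is a rough sketch.
It does not use the SpamBayes API: ToyBayes, spam_score, and
train_on_mistakes are hypothetical names, and the classifier is a
deliberately tiny word-count naive Bayes stand-in.  The 0.5 cutoff is also
just an assumption for illustration.

```python
import math
from collections import Counter


class ToyBayes:
    """Tiny word-count naive Bayes classifier (a stand-in, not SpamBayes)."""

    def __init__(self):
        # Per-class count of messages containing each word, keyed by is_spam.
        self.words = {True: Counter(), False: Counter()}
        # Number of messages learned per class.
        self.msgs = {True: 0, False: 0}

    def learn(self, text, is_spam):
        """Record each distinct word of one message under its true class."""
        self.words[is_spam].update(set(text.lower().split()))
        self.msgs[is_spam] += 1

    def spam_score(self, text):
        """Return a score in [0, 1]; an untrained filter returns 0.5."""
        log_odds = 0.0
        for word in set(text.lower().split()):
            # Laplace-smoothed fraction of each class's messages with the word.
            p_spam = (self.words[True][word] + 1) / (self.msgs[True] + 2)
            p_ham = (self.words[False][word] + 1) / (self.msgs[False] + 2)
            log_odds += math.log(p_spam / p_ham)
        # Clamp so math.exp cannot overflow on very long messages.
        log_odds = max(-50.0, min(50.0, log_odds))
        return 1.0 / (1.0 + math.exp(-log_odds))


def train_on_mistakes(clf, labelled_mail, cutoff=0.5):
    """Score messages in arrival order; train only when the guess is wrong.

    labelled_mail is an iterable of (text, is_spam) pairs.  Returns the
    number of messages that were actually used for training.
    """
    trained = 0
    for text, is_spam in labelled_mail:
        guessed_spam = clf.spam_score(text) > cutoff
        if guessed_spam != is_spam:
            clf.learn(text, is_spam)
            trained += 1
    return trained
```

The point of the loop is that correctly classified mail is left alone, so
the database stays small and tracks the mail you are receiving now rather
than an old corpus.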

Skip

