[Spambayes] Database Format
Brent L Johnson
brent at bjohnson.net
Mon Feb 9 12:18:22 EST 2004
First off, I've been using SpamBayes for a while. Since
CloudMark SpamNet went to being a pay service, I switched
and used CloudMark's spam folder to train SpamBayes.
It's working GREAT!
Now to my question: I found in the FAQ where I can
locate the classification database. Is there a way
I can extract data from this DB? I'm working on
a Bayesian email classifier in Java (not to compete with
SpamBayes, of course; hey, why mess with perfection?
hehe). It uses Classifier4J, which learns by being passed
strings of text labeled as spam or not spam. I'm using
it on the server side to pre-scan messages before
they hit Outlook.
Is there a way I can convert my SpamBayes database
to extract the words it considers spam?
I still have most of my spam sitting in my spam
folder, so I could theoretically find a way to export
it from Outlook, but that would be painfully slow
(I've currently got almost 12,000 spam emails saved).