[spambayes-dev] RE: [Spambayes] InBoxer/SpamAtBay beta available.
Skip Montanaro
skip at pobox.com
Tue Mar 16 17:34:59 EST 2004
Tony> What sort of thing is InBoxer doing to balance the database? Is
Tony> it actually removing messages/tokens from the database in the
Tony> category that's too high, or finding additional messages for the
Tony> category that's too low, doing some wizzy math, or something else?
I'd be interested in this as well. In the train-to-exhaustion script I
currently force it to train on pairs of ham and spam. That means the
database is rigorously in balance (except for the repeated training bit),
but it also means that many spams I've saved are so far unused. I generally
add hams that didn't score below 0.4 to the database just to try and boost the number
of hams. I'd like to know a good way to selectively discard spams for this
endeavor.
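
To make the pairing concrete, here is a rough sketch of the idea, not the actual train-to-exhaustion script. The function names and the (message, score) tuple layout are made up for illustration, and nothing here calls the SpamBayes API. It only shows how one-to-one pairing keeps the counts equal and leaves the surplus spam unused, and how the 0.4 cutoff would pick hams to add.

```python
def pair_for_training(hams, spams):
    """Pair hams and spams one-to-one (hypothetical helper).

    Returns (pairs, unused): the (ham, spam) pairs to train on, and
    whatever is left over from the larger of the two collections.
    """
    n = min(len(hams), len(spams))
    pairs = list(zip(hams[:n], spams[:n]))
    # Only one of these two slices is non-empty: the surplus category.
    unused = list(hams[n:]) + list(spams[n:])
    return pairs, unused


def hams_worth_adding(scored_hams, cutoff=0.4):
    """Keep hams that did not score below the cutoff (hypothetical helper).

    scored_hams is assumed to be an iterable of (message, score) tuples.
    """
    return [msg for msg, score in scored_hams if score >= cutoff]


if __name__ == "__main__":
    pairs, unused = pair_for_training(["h1", "h2"], ["s1", "s2", "s3"])
    print(len(pairs), "pairs trained;", len(unused), "message(s) unused")
    print(hams_worth_adding([("a", 0.1), ("b", 0.4), ("c", 0.9)]))
```

With more spam than ham, everything in unused is spam, which is exactly the pile I'd like a smarter way of choosing from.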
Skip