[spambayes-dev] lowest scoring message isn't always "best" one to train on
Seth Goodman
nobody at spamcop.net
Tue Jan 20 14:46:27 EST 2004
[Skip Montanaro]
> I've started calculating the delta mean as well as the number of messages
> pushed into spam territory. Just eyeballing a plot of just over 100 pairs
> of (mean diff, # new spams) suggests there's a weak correlation between the
> two variables.
I was thinking about this, so I'm glad you're playing with it. An additional
figure of merit for a message to train on might be a reduction in the SD of
the spam scores. This is almost as important as an increase in the mean (or
average, whichever you choose).
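
As a rough sketch of what I mean, here is one way to compute both figures
for a candidate message. It assumes you already have the spam scores of the
same set of known-spam messages before and after training on the candidate.
The function name and the sample scores are made up for illustration and
are not part of spambayes.

```python
from statistics import mean, pstdev

def training_merit(scores_before, scores_after):
    """Return (delta_mean, delta_sd) for a hypothetical candidate message.

    scores_before / scores_after: spam scores of the same known-spam
    messages before and after training on the candidate.
    """
    # Positive delta_mean: spam scores moved up on average.
    delta_mean = mean(scores_after) - mean(scores_before)
    # Negative delta_sd: spam scores are more tightly clustered.
    delta_sd = pstdev(scores_after) - pstdev(scores_before)
    return delta_mean, delta_sd

# Illustrative numbers only, not measured data.
before = [0.80, 0.90, 0.70, 1.00]
after = [0.90, 0.95, 0.85, 1.00]
d_mean, d_sd = training_merit(before, after)
print(d_mean, d_sd)
```

A candidate that gives a positive delta mean and a negative delta SD would
look good on both counts. How to weigh one against the other is an open
question.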
--
Seth Goodman
non-spam replies to sethg [at] GoodmanAssociates [dot] com