[spambayes-dev] lowest scoring message isn't always "best" one to train on

Seth Goodman nobody at spamcop.net
Tue Jan 20 14:46:27 EST 2004


[Skip Montanaro]
> I've started calculating the delta mean as well as the number of messages
> pushed into spam territory.  Just eyeballing a plot of just over 100 pairs
> of (mean diff, # new spams) suggests there's a weak correlation between
> the two variables.

I was thinking about this, so glad you're playing with it.  An additional
figure of merit for a message to train on might be a reduction in the SD of
the spam scores.  This is almost as important as an increase in the mean (or
average, whichever you choose).
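
To make the two figures of merit concrete, here is a rough sketch of how they might be computed for one candidate message. It assumes you already have the spam scores for the same set of messages before and after training on the candidate. The function name score_shift and the sample scores are made up for illustration and are not part of spambayes.

```python
from statistics import mean, pstdev


def score_shift(before, after):
    """Return (delta_mean, delta_sd) for spam scores before/after training.

    Hypothetical helper, not a spambayes API. Both arguments are
    sequences of scores in [0, 1] for the same set of spam messages.
    """
    if not before or not after:
        raise ValueError("need at least one score in each list")
    # Positive delta_mean: spam scores moved up on average.
    delta_mean = mean(after) - mean(before)
    # Negative delta_sd: spam scores are more tightly clustered.
    delta_sd = pstdev(after) - pstdev(before)
    return delta_mean, delta_sd


# Made-up scores for three spam messages, before and after
# training on one candidate message.
before = [0.50, 0.70, 0.90]
after = [0.80, 0.90, 1.00]
delta_mean, delta_sd = score_shift(before, after)
print(f"delta mean = {delta_mean:+.3f}, delta SD = {delta_sd:+.3f}")
```

Under this sketch, a candidate would look attractive when delta_mean is positive and delta_sd is negative. How to weight one against the other is left open.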

--
Seth Goodman

non-spam replies to sethg [at] GoodmanAssociates [dot] com



