[spambayes-dev] Another incremental training idea...
Seth Goodman
nobody at spamcop.net
Tue Jan 13 19:01:18 EST 2004
> Seth> Another related idea is to dynamically move the edge thresholds
> Seth> until the training ratio averages 1:1.
>
[Skip Montanaro]
> I think you'll quickly wind up moving the ham edge threshold to 0.00 and the
> spam edge threshold would wind up very near to your spam cutoff. That's not
> necessarily a bad thing, but it has to be considered.
Good point. Given an unbalanced input mail flow, which most of us seem to
have, 1:1 training makes this inevitable on one side or the other (unless
you use random sampling to select a subset of non-edge spam to train on, and
that scares me as well). We might as well list it as another variation on
non-edge training: I suggest calling it one-edge. Training on all ham but
only non-edge spam doesn't give me a particularly good feeling, but maybe
the 1:1 training ratio will let it perform well despite the unsatisfying way
the balance is achieved?
--
Seth Goodman
Humans: off-list replies to sethg [at] GoodmanAssociates [dot] com
Spambots: disregard the above