[Spambayes] train on error - to exhaustion?

David Relson relson at osagesoftware.com
Tue Dec 3 16:57:58 2002


At 11:27 AM 12/3/02, Greg Louis wrote:

>Doesn't look as though pure training-on-error is particularly
>advantageous with the Robinson-Fisher (chi) calculation method.  It may
>still be useful in maintaining the effectiveness of an established
>training base.

Greg,

That makes sense.  By definition, training-on-error adds only some of the
training messages to the word lists: the ones the classifier gets wrong.
The obvious result is smaller word lists.  Beyond list size, the effects
are less clear.  On the one hand, incoming messages will have fewer "hits"
in the word lists; on the other hand, the hits they do have will be more
"meaningful".  With smaller lists, there is less "breadth of knowledge"
about spam and ham, which could account for the lack of advantage from
training-on-error.
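
To make the list-size effect concrete, here is a minimal Python sketch.
It is a toy, not Spambayes or bogofilter code: the WordLists class, the
crude averaging score (which stands in for the Robinson-Fisher chi
calculation), the 0.5 cutoff, and the six-message corpus are all
assumptions made up for illustration.

```python
from collections import Counter


def tokenize(text):
    """Split a message into a set of lowercase words (toy tokenizer)."""
    return set(text.lower().split())


class WordLists:
    """Hypothetical stand-in for a pair of spam/ham word lists."""

    def __init__(self):
        self.spam = Counter()
        self.ham = Counter()

    def train(self, text, is_spam):
        # Count each distinct word once per trained message.
        (self.spam if is_spam else self.ham).update(tokenize(text))

    def hits(self, text):
        # Number of words in the message that appear in either list.
        return sum(1 for w in tokenize(text) if self.spam[w] + self.ham[w])

    def score(self, text):
        # Crude score: mean spam fraction over known words.
        # This is NOT the chi calculation, only a placeholder for it.
        probs = []
        for w in tokenize(text):
            s, h = self.spam[w], self.ham[w]
            if s + h:
                probs.append(s / (s + h))
        return sum(probs) / len(probs) if probs else 0.5

    def is_spam(self, text):
        return self.score(text) > 0.5

    def size(self):
        return len(set(self.spam) | set(self.ham))


def train_all(corpus):
    """Full training: every message goes into the word lists."""
    wl = WordLists()
    for text, is_spam in corpus:
        wl.train(text, is_spam)
    return wl


def train_on_error(corpus):
    """Train only on messages the current lists misclassify."""
    wl = WordLists()
    for text, is_spam in corpus:
        if wl.is_spam(text) != is_spam:
            wl.train(text, is_spam)
    return wl


# Made-up corpus of (message, is_spam) pairs.
CORPUS = [
    ("cheap pills buy now", True),
    ("meeting agenda for monday", False),
    ("buy cheap watches now", True),
    ("monday lunch meeting moved", False),
    ("cheap pills cheap watches", True),
    ("agenda for lunch attached", False),
]

if __name__ == "__main__":
    full = train_all(CORPUS)
    toe = train_on_error(CORPUS)
    msg = "buy cheap watches now"
    print("full training:     size", full.size(), "hits", full.hits(msg))
    print("training-on-error: size", toe.size(), "hits", toe.hits(msg))
```

On a corpus this small the only thing the sketch demonstrates is the
mechanism: the training-on-error lists end up smaller and a given message
finds fewer hits in them.  It says nothing about accuracy with the real
calculation.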

David