[Spambayes] incremental training strategies

T. Alexander Popiel popiel@wolfskeep.com
Mon Oct 28 17:28:55 2002


In message:  <15805.26237.16266.425547@montanaro.dyndns.org>
             Skip Montanaro <skip@pobox.com> writes:
>
>I am now running hammie.py from my procmailrc file, but not yet doing any
>filtering based on the results.  I trained it on my current setup (7000
>hams, 5000 spams).  Should I:
>
>    * train it on every message which passes through my inbox
>
>    * only train it on messages which it incorrectly classifies
>
>    * some other scheme
>
>?  Or is that not yet known?
>
>Skip

Speaking from a theoretical purity standpoint, I suspect that training
it on everything that came through would be 'cleaner'... but I have no
idea if in practice it would work any better than just training on the
mistakes and the messages it scores as unsure.

Try out variations, and post results?
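
For anyone who wants to try the comparison, here is a rough sketch of
the decision logic for the two schemes. Everything in it is made up for
illustration: the cutoffs are arbitrary assumptions and the function
names are hypothetical, not hammie.py's actual interface. It also
assumes you already know each message's true label (i.e. you sorted it
by hand) and have the score the classifier gave it before training.

```python
# Hypothetical sketch: none of these names come from hammie.py or Spambayes.
HAM_CUTOFF = 0.2   # assumed: scores below this are called ham
SPAM_CUTOFF = 0.9  # assumed: scores at or above this are called spam


def classify(score):
    """Map a spam probability in [0, 1] to 'ham', 'unsure', or 'spam'."""
    if score < HAM_CUTOFF:
        return "ham"
    if score >= SPAM_CUTOFF:
        return "spam"
    return "unsure"


def should_train(strategy, score, is_spam):
    """Decide whether a message with a known true label gets trained on."""
    if strategy == "everything":
        return True
    if strategy == "mistakes_and_unsure":
        verdict = classify(score)
        truth = "spam" if is_spam else "ham"
        # An 'unsure' verdict never equals the truth, so this one test
        # covers both outright mistakes and unsure messages.
        return verdict != truth
    raise ValueError("unknown strategy: %r" % (strategy,))


def count_trained(strategy, messages):
    """Count how many (score, is_spam) pairs the strategy would train on."""
    return sum(1 for score, is_spam in messages
               if should_train(strategy, score, is_spam))
```

Feeding the same stream of (score, is_spam) pairs through both
strategies shows how much less training the second scheme does; whether
the resulting error rates differ is the part that needs real mail to
answer.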

- Alex