[Spambayes] incremental training strategies
T. Alexander Popiel
popiel@wolfskeep.com
Mon Oct 28 17:28:55 2002
In message: <15805.26237.16266.425547@montanaro.dyndns.org>
Skip Montanaro <skip@pobox.com> writes:
>
>I am now running hammie.py from my procmailrc file, but not yet doing any
>filtering based on the results. I trained it on my current setup (7000
>hams, 5000 spams). Should I:
>
> * train it on every message which passes through my inbox
>
> * only train it on messages which it incorrectly classifies
>
> * some other scheme
>
>? Or is that not yet known?
>
>Skip
Speaking from a theoretical purity standpoint, I suspect that training
it on everything that comes through would be 'cleaner'... but I have no
idea whether in practice it would work any better than training only on
the mistakes and the unsure messages.
Try out variations, and post results?
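For anyone who wants to run that comparison, here is a rough sketch of the
two training loops. Everything in it is an assumption made for illustration:
ToyClassifier is a hypothetical stand-in with made-up learn() and score()
methods (it is not the hammie.py or Spambayes API), and the two cutoffs are
placeholder values, not recommended settings.

```python
from collections import Counter

# Placeholder cutoffs, chosen only for illustration.
HAM_CUTOFF, SPAM_CUTOFF = 0.2, 0.9


class ToyClassifier:
    """Hypothetical stand-in for a real classifier; NOT the Spambayes API."""

    def __init__(self):
        self.spam = Counter()  # token -> number of spams containing it
        self.ham = Counter()   # token -> number of hams containing it

    def learn(self, tokens, is_spam):
        # Count each distinct token once per message.
        (self.spam if is_spam else self.ham).update(set(tokens))

    def score(self, tokens):
        # Crude spamminess in [0, 1]: mean per-token spam fraction.
        # Tokens never seen in training are ignored; no evidence gives 0.5.
        probs = []
        for t in set(tokens):
            s, h = self.spam[t], self.ham[t]
            if s + h:
                probs.append(s / (s + h))
        return sum(probs) / len(probs) if probs else 0.5


def verdict(p):
    """Map a score to 'spam', 'ham', or 'unsure' using the cutoffs."""
    if p >= SPAM_CUTOFF:
        return "spam"
    if p <= HAM_CUTOFF:
        return "ham"
    return "unsure"


def train_on_everything(clf, stream):
    """Strategy 1: learn every message. Returns how many were trained."""
    trained = 0
    for tokens, is_spam in stream:
        clf.learn(tokens, is_spam)
        trained += 1
    return trained


def train_on_mistakes_and_unsure(clf, stream):
    """Strategy 2: learn only when the verdict is wrong or 'unsure'."""
    trained = 0
    for tokens, is_spam in stream:
        truth = "spam" if is_spam else "ham"
        if verdict(clf.score(tokens)) != truth:
            clf.learn(tokens, is_spam)
            trained += 1
    return trained
```

To compare the two, feed the same stream of (tokens, is_spam) pairs to a
fresh classifier under each strategy, then score a held-out set of messages
with both and count the errors and unsures.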
- Alex