[Spambayes] How many is enough?
T. Alexander Popiel
popiel at wolfskeep.com
Sun May 11 20:37:58 EDT 2003
In message: <5.2.0.9.0.20030512015036.02010fe0 at localhost>
Peter Bengtsson <mail at peterbe.com> writes:
>I've read the pages at http://spambayes.sourceforge.net/ now and concluded
>that you should train your database, but not too much.
>What I fail to find is some numbers for this. Are we talking about hundreds
>or thousands or millions?
>I've trained my database with 3000 ham and only 50 spam. That was basically
>all I had available in my email client at the moment.
>
>So, how much should I train before I run the risk of overdoing it?
I've not had any problems with training with tens of thousands
of messages (6345 ham, 17063 spam for last night's retrain).
The only reason I don't train with my full 44219-message archive
is to control database size... and that really is a pretty minor
concern. My incremental training tests showed that more training
data was better for accuracy...
- Alex
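
For anyone who wants to try an incremental training test like the one
mentioned above, here is a rough sketch. It is a toy and not the SpamBayes
classifier or its test harness: ToyClassifier, incremental_accuracy, and
the message format (a list of (text, label) pairs) are hypothetical names
made up for illustration, and the scoring is plain naive Bayes with
add-one smoothing and no class prior.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: whitespace-split, lowercased words are good enough for a demo.
    return text.lower().split()


class ToyClassifier:
    """Tiny naive Bayes stand-in (hypothetical; NOT the SpamBayes algorithm)."""

    def __init__(self):
        self.counts = {"ham": Counter(), "spam": Counter()}
        self.nmsgs = {"ham": 0, "spam": 0}

    def train(self, text, label):
        # Count each token at most once per message.
        self.counts[label].update(set(tokenize(text)))
        self.nmsgs[label] += 1

    def classify(self, text):
        scores = {}
        for label in ("ham", "spam"):
            n = self.nmsgs[label]
            # Sum of log P(token | label) with add-one smoothing.
            scores[label] = sum(
                math.log((self.counts[label][tok] + 1) / (n + 2))
                for tok in set(tokenize(text))
            )
        return "spam" if scores["spam"] > scores["ham"] else "ham"


def incremental_accuracy(train, test, step):
    """Train one message at a time; after every `step` messages (and at the
    end), record (messages_trained, accuracy_on_test)."""
    clf = ToyClassifier()
    results = []
    for i, (text, label) in enumerate(train, 1):
        clf.train(text, label)
        if i % step == 0 or i == len(train):
            correct = sum(clf.classify(t) == l for t, l in test)
            results.append((i, correct / len(test)))
    return results
```

Feed it your own labelled messages, held-out test set, and step size,
then look at whether the accuracy column keeps rising as the trained
count grows. Results on a toy like this say nothing about SpamBayes
itself; it only shows the shape of the experiment.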