[Spambayes] POP3 Server Performance Issue Win2K SP4

W. Eliot Kimber eliot at isogen.com
Mon Dec 8 17:34:28 EST 2003


Kenny Pitt wrote:

> W. Eliot Kimber wrote:
> 
>>Kenny Pitt wrote:
>>
>>>It usually isn't necessary to train on every message you receive,
>>>and if you've received 25,000 spams in about 3 months then I suspect
>>>your training is heavily over-balanced toward spam anyway.  I would
>>>suggest retraining on a much smaller set of messages (50-100 of each
>>>is probably more than sufficient).  After that, be more selective
>>>about which messages you actually train on.  If most of your
>>>messages classify correctly then you shouldn't need to train much at
>>>all. 
>>
>>I'm feeling a bit slow--but what is the process for doing this sort of
>>re-training? Do I simply delete hammie.db and then retrain again using
>>either new messages or old messages that I think are representative?
>>
>>I couldn't find any docs that spoke to this process directly.

> Yes, you can delete your database and start over.  You should probably
> delete both your statistics_database and your message_info_database just
> to make sure they stay in sync.  You can then retrain using a small,
> representative subset of messages.

This is what I did and it appears to have fixed the issue.

I think I realize what I did: I trained on my collected spam
folder, which probably had about 10K spams in it at the time. I guess
the answer is: don't do that.
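
For anyone who wants to pick the small, balanced training set suggested above, here is a rough Python sketch. It assumes the messages are saved one per file in two folders; the function name, the folder arguments, and the default of 100 per class are made up for illustration and are not part of Spambayes. It only chooses the files and does no training or database deletion itself.

```python
import random
from pathlib import Path


def pick_balanced_sample(ham_dir, spam_dir, per_class=100, seed=None):
    """Return (ham_files, spam_files) with the same number of paths in each.

    Hypothetical helper: assumes one message per file in each folder.
    """
    rng = random.Random(seed)
    ham = sorted(p for p in Path(ham_dir).iterdir() if p.is_file())
    spam = sorted(p for p in Path(spam_dir).iterdir() if p.is_file())
    # Never take more than the smaller folder can supply, so the two
    # classes stay balanced even when one folder is much larger.
    n = min(per_class, len(ham), len(spam))
    return rng.sample(ham, n), rng.sample(spam, n)
```

The two returned lists can then be fed to whatever training step you normally use, after the old databases have been removed.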

Thanks,

Eliot
-- 
W. Eliot Kimber
Innodata Isogen
eliot at isogen.com
www.isogen.com



