[Spambayes] Outlook plugin - training

Charles Cazabon python-spambayes@discworld.dyndns.org
Wed Nov 6 20:16:09 2002


Tim Peters <tim.one@comcast.net> wrote:
> 
> It will also create a database size problem:  without a strategy for pruning
> useless words, the database will grow without bounds (an intuition that at a
> certain non-fantastic size, "all words" will have been seen is incorrect for
> computer-based indexing apps, and especially for email -- unique words keep
> appearing and keep bloating the beast).

Did you actually observe this?  In my case the growth tailed off dramatically
fairly early.  I no longer have the exact numbers, but database growth for me
tailed off to almost nothing after I had trained on something like 1500
messages.
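
For anyone who wants to check this on their own mail, here is a rough
sketch of how the growth could be measured.  It is only an illustration:
tokenize() is a hypothetical stand-in (lowercased word split), not the
Spambayes tokenizer, and the function names are made up for this example.
It records the number of distinct words seen after each trained message.

```python
import re


def tokenize(text):
    # Hypothetical stand-in tokenizer (an assumption for this sketch);
    # the real Spambayes tokenizer produces many more kinds of tokens.
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def vocabulary_growth(messages):
    """Return the cumulative count of distinct words after each message."""
    seen = set()
    sizes = []
    for text in messages:
        seen |= tokenize(text)
        sizes.append(len(seen))
    return sizes


def new_words_per_message(sizes):
    """Return how many previously unseen words each message contributed."""
    return [after - before for before, after in zip([0] + sizes, sizes)]


if __name__ == "__main__":
    sample = ["buy cheap pills", "cheap pills now", "buy now"]
    sizes = vocabulary_growth(sample)
    print(sizes)
    print(new_words_per_message(sizes))
```

If the per-message count of new words keeps falling toward zero, the
database is levelling off; if it stays roughly constant, it is growing
without bound.  Which of the two happens will depend on the mail and on
the tokenizer used.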

Charles
-- 
-----------------------------------------------------------------------
Charles Cazabon                 <python-spambayes@discworld.dyndns.org>
GPL'ed software available at:     http://www.qcc.ca/~charlesc/software/
-----------------------------------------------------------------------


