[Spambayes] Backup database

Tim Peters tim.peters at gmail.com
Tue Oct 11 19:15:17 CEST 2005


[Jesse Pelton]
>  ...
> Developers: would it be feasible and sensible to add UI to allow users to
> remove messages older than a user-specified cutoff? If so, I'll log a
> feature request.

The database doesn't hold training messages; it only contains
statistics computed from the union of tokens seen across all training
messages.  Removing old messages from the training data
would require additional database work: a mapping from some sort of
message identifier to a list of all tokens that were seen in that
message, so that those _tokens_ could be removed from the statistics
later.  Note that many options change the exact tokens extracted from
a message, so it would not be enough just to save the original message
(there's no guarantee the same collection of tokens could be extracted
from it later).
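
To make the extra bookkeeping concrete, here is a minimal sketch of
such a mapping.  It is not SpamBayes code: the ToyTokenStats class, its
attribute names, and the message ids are all hypothetical, and it
assumes each token is counted at most once per message.  The point is
only that the tokens must be saved at training time, keyed by message
id, so that exactly those tokens can be subtracted later.

```python
from collections import Counter


class ToyTokenStats:
    """Hypothetical stand-in for a token-statistics store (not the SpamBayes API)."""

    def __init__(self):
        self.spam_counts = Counter()  # token -> number of spam messages containing it
        self.ham_counts = Counter()   # token -> number of ham messages containing it
        self.nspam = 0
        self.nham = 0
        # The extra mapping: message id -> (is_spam, tokens seen at training time).
        self.tokens_by_id = {}

    def learn(self, msg_id, tokens, is_spam):
        if msg_id in self.tokens_by_id:
            raise ValueError("message already trained: %r" % (msg_id,))
        token_set = frozenset(tokens)  # count each token once per message
        counts = self.spam_counts if is_spam else self.ham_counts
        counts.update(token_set)
        if is_spam:
            self.nspam += 1
        else:
            self.nham += 1
        # Save the tokens themselves, not the message: re-tokenizing later
        # under different options might not give the same tokens back.
        self.tokens_by_id[msg_id] = (is_spam, token_set)

    def unlearn(self, msg_id):
        is_spam, token_set = self.tokens_by_id.pop(msg_id)
        counts = self.spam_counts if is_spam else self.ham_counts
        for tok in token_set:
            counts[tok] -= 1
            if counts[tok] <= 0:
                del counts[tok]
        if is_spam:
            self.nspam -= 1
        else:
            self.nham -= 1


stats = ToyTokenStats()
stats.learn("m1", ["cheap", "pills", "cheap"], is_spam=True)
stats.learn("m2", ["cheap", "meeting"], is_spam=True)
stats.unlearn("m1")  # removes m1's contribution, leaving only m2's
```

Note that tokens_by_id grows with every trained message, which is
where the larger database comes from.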

That would be a fair amount of work, another pile of messy UI issues,
and would need a larger database.

FWIW, I routinely throw away my database and start over from scratch
too.  Watching it improve is fun :-)!

