[spambayes-dev] Re: [Spambayes] Database cleaning?
Matthew Dixon Cowles
matt at mondoinfo.com
Sun Jun 1 14:48:19 EDT 2003
[Tim]
> The original spambayes code saved a time-of-last-access stamp in
> each WordInfo record. That was to support research into database
> cleaning strategies. The research never happened, though, and
> several WordInfo members got tossed to reduce the database size.
> If people want to start research on this again, an official patch
> set to maintain this kind of info in researchers' databases would
> be a real help.
I patched my classifier to record when a token is used in scoring at
the same time that I patched it to record the other statistics. My
thought is to have my classifier calculate several scores, some
ignoring tokens that haven't been used in scoring for a while. I
haven't gotten to that part yet, but if anyone is interested in the
(trivial) changes so far, I'd be glad to upload the patch to
SourceForge.
Regards,
Matt
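
A minimal sketch of the idea described in the message above: stamp each token when it is used in scoring, and compute several scores, some of which ignore tokens that have not been used for a while. This is not the patch mentioned above and not the spambayes API. TokenRecord, token_spam_prob, score, and score_and_touch are hypothetical names, and the plain averaging of per-token probabilities is a stand-in chosen for brevity, not the combining scheme the real classifier uses.

```python
from dataclasses import dataclass


@dataclass
class TokenRecord:
    # Hypothetical record layout; the real WordInfo fields may differ.
    spam_count: int
    ham_count: int
    last_used: float = 0.0  # time the token last contributed to a score


def token_spam_prob(rec, nspam, nham):
    """Rough per-token spam probability from training counts."""
    spam_ratio = rec.spam_count / max(nspam, 1)
    ham_ratio = rec.ham_count / max(nham, 1)
    total = spam_ratio + ham_ratio
    return 0.5 if total == 0 else spam_ratio / total


def score(tokens, db, nspam, nham, now, max_age=None):
    """Average per-token probability, skipping tokens idle longer than max_age."""
    probs = []
    for tok in set(tokens):
        rec = db.get(tok)
        if rec is None:
            continue  # token never seen in training
        if max_age is not None and now - rec.last_used > max_age:
            continue  # token has not been used in scoring recently
        probs.append(token_spam_prob(rec, nspam, nham))
    return sum(probs) / len(probs) if probs else 0.5


def score_and_touch(tokens, db, nspam, nham, now, max_ages):
    """Compute one score per cutoff, then stamp every known token as used."""
    scores = {age: score(tokens, db, nspam, nham, now, age) for age in max_ages}
    # Stamp only after all scores are computed so the cutoffs see the old times.
    for tok in set(tokens):
        if tok in db:
            db[tok].last_used = now
    return scores
```

Calling score_and_touch with max_ages such as [None, 30 * 86400] would give one score over the whole database and one that ignores tokens idle for more than 30 days, which is the kind of side-by-side comparison the message proposes. Whether a token skipped by a cutoff should still have its timestamp refreshed is a design choice; this sketch refreshes it.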