[Spambayes] expiration ideas.
Anthony Baxter
anthony@interlink.com.au
Sun Oct 20 13:50:05 2002
Just thinking again about expiration, and wondering if the following
would work:
When training on new data (say, a new week's worth), train it into a
fresh classifier ("interim"). Once it's trained, merge the interim
classifier's wordinfo into your master classifier's wordinfo by adding
the new spamcounts and hamcounts to the master wordinfo blob, then
recalculate the probabilities.
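
A minimal sketch of the merge step described above. It is not the actual
Spambayes classifier code: wordinfo is modelled here as a plain dict
mapping each word to a (spamcount, hamcount) tuple, and merge_counts is a
hypothetical helper name. Recalculating the probabilities is left to
whatever the classifier already does after training.

```python
def merge_counts(master, interim):
    """Add the interim (spamcount, hamcount) pairs into master, in place.

    Both arguments are plain dicts of word -> (spamcount, hamcount);
    this layout is an assumption made for illustration only.
    """
    for word, (spam, ham) in interim.items():
        # Words the master has never seen start from zero counts.
        m_spam, m_ham = master.get(word, (0, 0))
        master[word] = (m_spam + spam, m_ham + ham)
    return master


master = {"free": (3, 1)}
interim = {"free": (2, 0), "python": (0, 4)}
merge_counts(master, interim)
```

A real classifier presumably also tracks how many spam and ham messages
it has been trained on in total, and those totals would need to be added
in the same way before the probabilities are recomputed.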
Keep the "interim" wordinfo around (gzipped, datestamped) until your
expiration time is up - then undo the earlier merge by subtracting
the interim spamcounts and hamcounts from the master wordinfo.
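
The keep-then-undo step could look roughly like this. Again a sketch
under assumptions, not Spambayes code: wordinfo is a plain dict of
word -> (spamcount, hamcount), the interim-YYYY-MM-DD.pck.gz file naming
is invented here, and save_interim, unmerge_counts and expire_old are
hypothetical helper names.

```python
import gzip
import pickle
from datetime import date, timedelta
from pathlib import Path


def save_interim(interim, directory, day):
    """Write the interim counts to a gzipped, datestamped pickle file."""
    path = Path(directory) / f"interim-{day.isoformat()}.pck.gz"
    with gzip.open(path, "wb") as f:
        pickle.dump(interim, f)
    return path


def unmerge_counts(master, interim):
    """Subtract interim counts from master, dropping words that hit zero."""
    for word, (spam, ham) in interim.items():
        m_spam, m_ham = master.get(word, (0, 0))
        # Clamp at zero so a mismatched file cannot produce negative counts.
        spam_left = max(m_spam - spam, 0)
        ham_left = max(m_ham - ham, 0)
        if spam_left or ham_left:
            master[word] = (spam_left, ham_left)
        else:
            master.pop(word, None)
    return master


def expire_old(master, directory, today, max_age_days):
    """Undo every saved merge older than max_age_days and delete its file."""
    cutoff = today - timedelta(days=max_age_days)
    for path in sorted(Path(directory).glob("interim-*.pck.gz")):
        # Recover the date stamp from the file name.
        stamp = path.name[len("interim-"):-len(".pck.gz")]
        if date.fromisoformat(stamp) < cutoff:
            with gzip.open(path, "rb") as f:
                unmerge_counts(master, pickle.load(f))
            path.unlink()
    return master
```

Dropping words whose counts both fall to zero keeps the master wordinfo
from growing forever, which seems to be the point of expiring data in
the first place. The probabilities would need recalculating after each
expiry run, just as after a merge.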
Thoughts? Unless there's a screamingly obvious "don't be stupid" I'll
play with this tomorrow (ah, leave....)
Anthony