[Spambayes] expiration ideas.

Anthony Baxter anthony@interlink.com.au
Sun Oct 20 13:50:05 2002


Just thinking again about expiration, and wondering if the following 
would work:

  When training new data (say a new week's worth), train it into a 
  fresh classifier ("interim"). Once it's trained, merge the interim 
  classifier's wordinfo into your master classifier's wordinfo by adding 
  the interim spamcounts and hamcounts to the master wordinfo blob, then 
  recalculate the probabilities. 

  Keep the "interim" wordinfo around (gzipped, datestamped) until your
  expiration time is up - then undo the earlier merge by subtracting
  its spamcounts and hamcounts from the master, and recalculate again.
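
Roughly, the merge/undo step might look like the sketch below. It
assumes wordinfo is just a dict mapping word -> [spamcount, hamcount];
the real classifier's records carry more than that, and the function
names here are made up for illustration. Presumably the overall spam
and ham message totals would need the same add/subtract treatment
before the probabilities are recalculated, which this sketch leaves out.

```python
import gzip
import pickle


def merge(master, interim):
    """Add the interim counts into the master wordinfo (in place)."""
    for word, (spam, ham) in interim.items():
        rec = master.setdefault(word, [0, 0])
        rec[0] += spam
        rec[1] += ham


def unmerge(master, interim):
    """Undo an earlier merge by subtracting the interim counts."""
    for word, (spam, ham) in interim.items():
        rec = master[word]
        rec[0] -= spam
        rec[1] -= ham
        # Drop words that no longer have any training evidence.
        if rec[0] == 0 and rec[1] == 0:
            del master[word]


def save_interim(interim, path):
    """Write the interim wordinfo out gzipped; caller picks a dated name."""
    with gzip.open(path, "wb") as f:
        pickle.dump(interim, f)


def load_interim(path):
    """Read a gzipped interim wordinfo back in at expiration time."""
    with gzip.open(path, "rb") as f:
        return pickle.load(f)
```

  At expiration time you'd load_interim() the datestamped file,
  unmerge() it from the master, and recalculate.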

Thoughts? Unless there's a screamingly obvious "don't be stupid" I'll
play with this tomorrow (ah, leave....)

Anthony