[Spambayes] statistical comparison of environment?
T.A.Meyer at massey.ac.nz
Fri Mar 7 11:06:53 EST 2003
> Testing of new tokens like this has dropped off since about
> last October... spambayes is already good enough for just
> about everyone to be happy. My recent tests on training
> methods seem to show that accuracy has been dropping off for
> the last two months, though, so it may be time to revisit
> this problem...
I'm (slowly) wading through the archives (interesting reading, but *long*), and have reached about this point. It does seem that the majority of the testing was done on certain collections of spam (along with lots of different ham). I wonder whether things got tuned a little too closely to those collections, and now that the spam is a little different, some options might need to be revisited (rather than just retraining).
Once I'm done with the archives (and then the options stuff), I'll try to set up a testing system so that I can work on that. I'm personally most interested in the effects of aging, the ham:spam ratio (with the current code), and how long spambayes takes to become effective, so I'll concentrate on those.