[spambayes-dev] Another incremental training idea...

T. Alexander Popiel popiel at wolfskeep.com
Tue Jan 13 18:09:26 EST 2004


In message:  <16388.29943.121292.675974 at montanaro.dyndns.org>
             Skip Montanaro <skip at pobox.com> writes:

>For some reason, my ham/spam ratio is getting out-of-whack faster than it
>seemed to in the past.  Another trick I'm experimenting with to keep things
>in closer balance is to rescore my spam mailbox and delete some of those
>which now score a rounded 1.00 (they didn't when I first scored them).
>There are probably all sorts of holes in that idea, but I figured I'd toss
>it out there for anyone interested.

This is related to the behaviour of TOAE with expiry: when expiry
started to take effect, the number of spams expired out of the
database was significantly higher than the number of new spams
getting trained... for about two weeks.  After that, things got
even worse.  Graphs at:
http://www.wolfskeep.com/~popiel/spambayes/plots/expire.html.

Look at the cumulative trained counts for nonedgeexpire vs. those for
plain TOAE at:
http://www.wolfskeep.com/~popiel/spambayes/nonedge.

I have not yet tried TOAE with balance maintenance.
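
For anyone who wants to play with the rescore-and-delete idea quoted
above, here is a minimal sketch.  It assumes a hypothetical score(msg)
callable returning a spam probability in [0, 1]; it is not the
spambayes API, and the names prune_confident_spam and ham_count are
made up for illustration.  It only picks which spams to drop, and
only as many as needed to get the spam count down to the ham count.
Untraining or retraining on the result is left to the caller.

```python
def prune_confident_spam(spam_msgs, score, ham_count):
    """Split spam_msgs into (kept, removed).

    A message is removed only if it now scores a rounded 1.00 and the
    spam pile is still larger than ham_count.  `score` is a
    hypothetical callable (an assumption, not a spambayes function).
    """
    excess = len(spam_msgs) - ham_count  # how many spams we may drop
    kept, removed = [], []
    for msg in spam_msgs:
        if excess > 0 and round(score(msg), 2) >= 1.0:
            removed.append(msg)
            excess -= 1
        else:
            kept.append(msg)
    return kept, removed


if __name__ == "__main__":
    # Toy scores standing in for a real classifier.
    scores = {"a": 1.0, "b": 0.97, "c": 0.999, "d": 0.996}
    kept, removed = prune_confident_spam(list("abcd"), scores.get, ham_count=2)
    print("kept:", kept)
    print("removed:", removed)
```

Messages are visited in mailbox order, so the oldest confident spams
go first; whether that is the right choice is an open question.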

- Alex


