[spambayes-dev] train to exhaustion?

Skip Montanaro skip at pobox.com
Fri Feb 13 12:12:36 EST 2004

    Kenny> In most cases that's probably true.  However, here's an example
    Kenny> from one of the test runs of tte.py that Tony posted to his web
    Kenny> site:

    Kenny> round:  3, msgs: 1312, ham misses:   2, spam misses:   0
    Kenny> round:  4, msgs: 1312, ham misses:   0, spam misses:   2
    Kenny> round:  5, msgs: 1312, ham misses:   0, spam misses:   1
    Kenny> round:  6, msgs: 1312, ham misses:   0, spam misses:   0

    Kenny> The total number of misses did not decrease between rounds 3 and
    Kenny> 4, but further rounds did reduce the misses to zero.

Understood.  I'm sure there are ways around that, such as saving the total
misses from the last N rounds and exiting if they increase or fail to
decrease within M rounds (M < N).
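
One way that check might look, as a rough sketch only.  This is not code
from tte.py; make_stop_check, n and m are names made up for illustration,
and stopping at zero misses is an assumption about how the training loop
ends.  It keeps the totals from the last n rounds and says "stop" when the
best of the most recent m rounds is no better than the best of the older
ones:

```python
from collections import deque


def make_stop_check(n, m):
    """Build a stop check over the last n round totals (hypothetical helper).

    The returned function takes the total misses (ham + spam) for the round
    just finished and returns True when training should stop.
    """
    if not 0 < m < n:
        raise ValueError("need 0 < m < n")
    window = deque(maxlen=n)  # totals from the last n rounds only

    def check(total_misses):
        window.append(total_misses)
        if total_misses == 0:
            # Assumption: nothing left to train on, so stop.
            return True
        if len(window) <= m:
            # Not enough history yet to compare recent against older rounds.
            return False
        totals = list(window)
        recent, older = totals[-m:], totals[:-m]
        # Stop if none of the last m rounds beat the best earlier round.
        return min(recent) >= min(older)

    return check


# Totals from Kenny's example, rounds 3 through 6: 2, 2, 1, 0.
check = make_stop_check(n=4, m=2)
decisions = [check(t) for t in (2, 2, 1, 0)]
```

With m=2 the flat spot between rounds 3 and 4 is tolerated and the run
continues until the misses reach zero.  With m=1 the same data would stop
after round 4, which is the case Kenny describes.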

    Kenny> If nothing else, it fails to account for Tony's original
    Kenny> question: "if one message was still a false-positive, but moved
    Kenny> from 0.8 to 0.7, is that improving?"

That's not how I interpreted the description on Gary's blog.  Either it
moves into the desired zone or it doesn't.

I've been using my tte.py script for a few days now and haven't noticed this
as a practical problem.  I suspect we're worrying about a problem that won't
arise.  Maybe add a maxrounds flag?
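
A cap like that could be as small as the sketch below.  It is hypothetical:
the --maxrounds option name, its default, and the run_round callable are
all assumptions for illustration, not what tte.py actually does.

```python
import argparse


def train_to_exhaustion(run_round, maxrounds):
    """Run rounds until one has zero misses or maxrounds is reached.

    run_round is a hypothetical callable that does one training pass and
    returns the total misses for it.  Returns (rounds_run, last_misses).
    """
    misses = None
    for round_no in range(1, maxrounds + 1):
        misses = run_round()
        if misses == 0:
            return round_no, misses
    # Gave up: the cap was hit before the misses reached zero.
    return maxrounds, misses


def parse_args(argv):
    parser = argparse.ArgumentParser(description="train to exhaustion (sketch)")
    # Hypothetical option; the default of 10 is an arbitrary choice.
    parser.add_argument("--maxrounds", type=int, default=10,
                        help="give up after this many rounds")
    return parser.parse_args(argv)
```

The cap does not try to judge whether a round "improved"; it only
guarantees the loop ends.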

