[spambayes-dev] train to exhaustion?
skip at pobox.com
Fri Feb 13 12:12:36 EST 2004
Kenny> In most cases that's probably true. However, here's an example
Kenny> from one of the test runs of tte.py that Tony posted to his web
Kenny> round: 3, msgs: 1312, ham misses: 2, spam misses: 0
Kenny> round: 4, msgs: 1312, ham misses: 0, spam misses: 2
Kenny> round: 5, msgs: 1312, ham misses: 0, spam misses: 1
Kenny> round: 6, msgs: 1312, ham misses: 0, spam misses: 0
Kenny> The total number of misses did not decrease between rounds 3 and
Kenny> 4, but further rounds did reduce the misses to zero.
Understood. I'm sure there are ways around that, such as saving the total
misses from the last N rounds and exiting if they increase, or fail to
decrease, within M rounds (M < N).
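
A rough sketch of that stopping rule, for illustration only. The names
should_stop and rounds_until_stop are hypothetical and are not part of
tte.py; "total misses" is assumed to mean ham misses plus spam misses for a
round, and "don't decrease within M rounds" is read as "none of the last M
rounds beat the best earlier total still in the window".

```python
from collections import deque


def should_stop(history, m):
    """Return True if none of the last m totals beat the best earlier one.

    history holds total misses per round, oldest first (at most N entries).
    """
    history = list(history)
    if len(history) <= m:
        # Not enough rounds yet to compare against anything earlier.
        return False
    return min(history[-m:]) >= min(history[:-m])


def rounds_until_stop(totals, n, m):
    """Return the 1-based round at which training would stop, else None.

    totals is the per-round total miss count; n is the window size and
    m the patience, with m < n as suggested above.
    """
    history = deque(maxlen=n)  # keeps only the last n totals
    for round_no, total in enumerate(totals, start=1):
        history.append(total)
        if total == 0 or should_stop(history, m):
            return round_no
    return None


# Totals for rounds 3..6 of the quoted run: 2, 2, 1, 0
print(rounds_until_stop([2, 2, 1, 0], n=4, m=2))
print(rounds_until_stop([2, 2, 1, 0], n=4, m=1))
```

With M=2 the plateau between the first two of those rounds is tolerated and
training carries on until the misses reach zero. With M=1 the same plateau
ends training early, which is the case Kenny describes.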
Kenny> If nothing else, it fails to account for Tony's original
Kenny> question: "if one message was still a false-positive, but moved
Kenny> from 0.8 to 0.7, is that improving?"
That's not how I interpreted the description on Gary's blog. Either it
moves into the desired zone or it doesn't.
I've been using my tte.py script for a few days now and haven't noticed this
as a practical problem. I suspect we're worrying about a problem that won't
arise. Maybe add a maxrounds flag?
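
For what it's worth, a maxrounds cap might look something like the sketch
below. This is not the actual tte.py code: the --maxrounds option name, its
default of 10, and the run_round callback (assumed to return a
(ham_misses, spam_misses) pair for one training pass) are all made up for
illustration.

```python
import argparse


def parse_args(argv=None):
    """Parse a hypothetical --maxrounds option."""
    parser = argparse.ArgumentParser(description="train-to-exhaustion sketch")
    parser.add_argument("--maxrounds", type=int, default=10,
                        help="give up after this many training rounds")
    return parser.parse_args(argv)


def train(run_round, maxrounds):
    """Run rounds until there are no misses or maxrounds is reached.

    Returns (rounds_run, misses_in_last_round).
    """
    round_no, misses = 0, None
    for round_no in range(1, maxrounds + 1):
        misses = sum(run_round(round_no))
        if misses == 0:
            break  # every message scored in its desired zone
    return round_no, misses


args = parse_args(["--maxrounds", "3"])
# A stand-in round that never converges, so the cap is what ends the loop.
print(train(lambda round_no: (0, 1), args.maxrounds))
```

The cap only bounds the worst case. It does not replace the "no misses"
exit, which still ends training as soon as a round is clean.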