[Spambayes] Perhaps a level header would be useful?
tim.one at comcast.net
Tue Mar 11 23:24:49 EST 2003
> IIRC, cmp.py never got updated to deal sensibly with
> unsures. If that's right, it shouldn't be used except when spam_cutoff
> == ham_cutoff. Then you've got a two-outcome classifier (no unsures),
> and cmp.py won't "forget" any msgs.
> I think this is still the case. If there is going to be a minor
> increase in testing again, which is the better option, to have
> ham_cutoff==spam_cutoff, or to update to reveal unsure info? (I
> suspect the latter).
It depends on what you're trying to accomplish, of course <wink>. Updating
cmp.py is a project, because it was never intended to deal with unsures, and
they don't fit well with its very detailed analysis of FP and FN. Note that
the less-exhaustive table.py *does* deal with unsures already, and with
automating cutoff analysis (based on your histogram option settings). After
Alex invented table.py, I rarely used cmp.py again except to zero in on
changes with very small effects. Using table.py, you can skip the rates.py
step(s) too: table.py works directly with the output files produced by
timtest.py (if you must) or timcv.py (preferred).
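
To make the point about cutoffs concrete, here is a minimal sketch of how two cutoffs produce three outcomes, and why setting spam_cutoff == ham_cutoff leaves no unsures. The names classify and tally are hypothetical, not part of the Spambayes code; the default cutoff values and the exact boundary comparisons (< versus >=) are assumptions chosen for illustration.

```python
def classify(score, ham_cutoff=0.20, spam_cutoff=0.90):
    """Map a score in [0.0, 1.0] to 'ham', 'unsure', or 'spam'.

    Hypothetical helper: the cutoff defaults and the boundary
    handling are illustrative assumptions, not the project's code.
    """
    if ham_cutoff > spam_cutoff:
        raise ValueError("ham_cutoff must not exceed spam_cutoff")
    if score >= spam_cutoff:
        return "spam"
    if score < ham_cutoff:
        return "ham"
    # Reached only when ham_cutoff <= score < spam_cutoff, which is
    # an empty range when the two cutoffs are equal.
    return "unsure"


def tally(scores, ham_cutoff, spam_cutoff):
    """Count how many scores fall into each of the three outcomes."""
    counts = {"ham": 0, "unsure": 0, "spam": 0}
    for score in scores:
        counts[classify(score, ham_cutoff, spam_cutoff)] += 1
    return counts


if __name__ == "__main__":
    scores = [0.05, 0.50, 0.95]
    print(tally(scores, 0.20, 0.90))  # three outcomes possible
    print(tally(scores, 0.50, 0.50))  # two outcomes only
```

With separate cutoffs the middle score lands in "unsure"; with equal cutoffs every message is either ham or spam, which is the only situation in which a strictly FP/FN analysis accounts for every message.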
> Thanks again. [Must think more before posting. Must think more before
> posting. Must think...]
You're doing fine! Thinking is overrated <wink>, and if I can't remember
why we did something one way instead of another, we should probably throw it
out and start that part over again.