[Spambayes] Comparing chi to zcombine
Tim Peters
tim.one@comcast.net
Mon, 14 Oct 2002 16:33:10 -0400
[Brad Clements]
> ...
> -> best cutoff for all runs: 0.985
> -> with weighted total 10*25 fp + 152 fn = 402
> -> fp rate 0.192% fn rate 1.17%
> saving ham histogram pickle to class_hamhist.pik
> saving spam histogram pickle to class_spamhist.pik
Note that a single cutoff value doesn't make sense for the "middle ground"
methods. Since you ran this, I checked in changes to histogram analysis
that compute "best" ham *and* spam cutoff points, where best minimizes a
function with three distinct costs (cost of an FP, cost of an FN, cost of an
"unsure" msg). You set those costs to what makes sense for your application
(e.g., as I've said many times, *I'd* rather get an fp than an fn for my own
use, as I'm going to review every rejection anyway, and I just want to
shuffle spam out of my main inbox so it doesn't interfere with normal
workflow; I may be unique in that, though).
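
To make the three-cost idea concrete, here is a minimal sketch of such a search. It is not the checked-in histogram code: the function name best_cutoffs, the list-of-counts histogram layout, and the brute-force loop are all assumptions made for illustration. Buckets below the ham cutoff are called ham, buckets at or above the spam cutoff are called spam, and everything in between is "unsure".

```python
def best_cutoffs(ham_hist, spam_hist, fp_cost=10.0, fn_cost=1.0, unsure_cost=0.2):
    """Brute-force search for the cheapest (ham_cutoff, spam_cutoff) pair.

    ham_hist[i] and spam_hist[i] are message counts for score bucket i,
    where the buckets split the score range 0..1 into equal widths.
    (Hypothetical layout, chosen for this sketch.)
    Returns (cost, ham_cutoff, spam_cutoff), cutoffs given as scores.
    """
    n = len(ham_hist)
    assert len(spam_hist) == n
    best = None
    for h in range(n + 1):          # buckets < h are classified as ham
        for s in range(h, n + 1):   # buckets >= s are classified as spam
            fn = sum(spam_hist[:h])                           # spam called ham
            fp = sum(ham_hist[s:])                            # ham called spam
            unsure = sum(ham_hist[h:s]) + sum(spam_hist[h:s])  # middle ground
            cost = fp * fp_cost + fn * fn_cost + unsure * unsure_cost
            if best is None or cost < best[0]:
                best = (cost, h / n, s / n)
    return best


# Toy 4-bucket histograms, invented purely for illustration.
toy_ham = [90, 8, 1, 1]
toy_spam = [1, 3, 6, 90]
toy_cost, toy_ham_cutoff, toy_spam_cutoff = best_cutoffs(toy_ham, toy_spam)
```

Swapping the fp and fn costs (as someone with my preferences would) moves both cutoffs, which is the point of making the costs explicit.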
I was able to run that analysis over the z-combining histogram you included
here, but it's impossible to guess what it would have said for your
chi-combining run:
-> best cost for Brad z-combining run: $301.40
-> per-fp cost $10.00; per-fn cost $1.00; per-unsure cost $0.20
-> achieved at ham & spam cutoffs 0.6 & 0.995
-> fp 23; fn 21; unsure ham 67; unsure spam 185
-> fp rate 0.177%; fn rate 0.162%; unsure rate 0.969%
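
As a sanity check, the reported cost and rates can be reproduced from the counts above. The spam count of 13000 comes from the stats line in this message; the ham count of 13000 is an assumption, inferred because it is what makes the fp rate come out to 0.177%.

```python
# Counts and per-message costs copied from the analysis output above.
fp, fn = 23, 21
unsure_ham, unsure_spam = 67, 185
fp_cost, fn_cost, unsure_cost = 10.00, 1.00, 0.20

n_spam = 13000   # from the "Spam scores for all runs" stats line
n_ham = 13000    # ASSUMPTION: inferred from the reported fp rate

unsure = unsure_ham + unsure_spam
total_cost = fp * fp_cost + fn * fn_cost + unsure * unsure_cost

fp_rate = 100.0 * fp / n_ham                     # percent of ham
fn_rate = 100.0 * fn / n_spam                    # percent of spam
unsure_rate = 100.0 * unsure / (n_ham + n_spam)  # percent of all msgs

print("cost $%.2f; fp %.3f%%; fn %.3f%%; unsure %.3f%%"
      % (total_cost, fp_rate, fn_rate, unsure_rate))
```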
.995 is the highest bucket there was, so it couldn't draw any finer
distinction among the 23 ham in the .995 bucket. Boosting nbuckets would
allow a more exact analysis. OTOH, those fp are scoring so high they may be
hopeless. On the third hand,
-> <stat> Spam scores for all runs: 13000 items; mean 99.76; sdev 3.66
-> <stat> min 0; median 100; max 100
*at least* half your spam scored 100 under z-combining (because the median
spam score was 100), so there may well be a useful distinction remaining to be
drawn within the .995 bucket.
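
A rough illustration of why more buckets could help, assuming equal-width buckets over 0..1 with the top score clamped into the last bucket. The bucket_of helper and the three sample scores are hypothetical, not taken from the histogram code or from Brad's data; 200 buckets is assumed only because it is consistent with .995 being the highest bucket.

```python
def bucket_of(score, nbuckets):
    """Index of the equal-width bucket holding score (0.0 <= score <= 1.0).

    A score of exactly 1.0 is clamped into the last bucket.
    (Hypothetical helper for this sketch.)
    """
    return min(int(score * nbuckets), nbuckets - 1)


# Three made-up scores that all sit at or above 0.995.
scores = [0.9962, 0.9983, 1.0]

coarse = [bucket_of(s, 200) for s in scores]    # bucket width 0.005
fine = [bucket_of(s, 2000) for s in scores]     # bucket width 0.0005

print("200 buckets: ", coarse)
print("2000 buckets:", fine)
```

With 200 buckets all three scores land in the same top bucket, so no cutoff can separate them; with 2000 buckets each gets its own.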