[Spambayes] Need testers!
Tim Peters
tim.one@comcast.net
Sun, 15 Sep 2002 03:45:47 -0400
[Tim]
> 1. Run a baseline and save a summary file (rates.py).
>
> 2. Make exactly one change, adding
>
> [Classifier]
> adjust_probs_by_evidence_mass: True
>
> to your bayescustomize.ini file.
>
> 3. Run the same test scenario again, and create another summary file.
>
> 4. Run cmp.py over the summary files and post the cmp.py output
> (all of it, please). Or mail it to me, but I think there's value
> in public humiliation <wink> if it helps get better results.
This worked well for my very large training sets, but was a clear loss when
I ran on smaller subsets. I've checked in another version that repaired
this for me. To try it, do cvs up and change step #2 to this:
[Classifier]
adjust_probs_by_evidence_mass: True
min_spamprob: 0.001
max_spamprob: 0.999
hambias: 1.5
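Before kicking off a long run, it can be worth confirming the fragment above is well-formed. The sketch below parses it with Python's standard configparser. It is only a sanity check: it assumes the stdlib parser reads the file the same way the project's own option loader does, and INI_TEXT is a made-up name for illustration.

```python
import configparser

# The fragment from step #2, inlined for illustration (INI_TEXT is a made-up name).
INI_TEXT = """\
[Classifier]
adjust_probs_by_evidence_mass: True
min_spamprob: 0.001
max_spamprob: 0.999
hambias: 1.5
"""

parser = configparser.ConfigParser()
parser.read_string(INI_TEXT)

section = parser["Classifier"]
adjust = section.getboolean("adjust_probs_by_evidence_mass")
min_p = section.getfloat("min_spamprob")
max_p = section.getfloat("max_spamprob")
hambias = section.getfloat("hambias")

# Basic consistency check: the clamp range must be a proper sub-interval of (0, 1).
assert 0.0 < min_p < max_p < 1.0
print(adjust, min_p, max_p, hambias)
```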
On my giant
-> <stat> tested 2000 hams & 1375 spams against 18000 hams & 12375 spams
10-fold c-v run, this is the difference:
false positive percentages
0.000 0.000 tied
0.000 0.000 tied
0.000 0.000 tied
0.000 0.000 tied
0.050 0.050 tied
0.000 0.050 lost +(was 0)
0.000 0.000 tied
0.050 0.050 tied
0.000 0.000 tied
0.100 0.050 won -50.00%
won 1 times
tied 8 times
lost 1 times
total unique fp went from 4 to 4 tied
mean fp % went from 0.02 to 0.02 tied
false negative percentages
0.218 0.145 won -33.49%
0.364 0.364 tied
0.000 0.073 lost +(was 0)
0.218 0.218 tied
0.218 0.218 tied
0.291 0.145 won -50.17%
0.218 0.073 won -66.51%
0.145 0.145 tied
0.291 0.218 won -25.09%
0.073 0.000 won -100.00%
won 5 times
tied 4 times
lost 1 times
total unique fn went from 28 to 22 won -21.43%
mean fn % went from 0.203636363636 to 0.16 won -21.43%
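For anyone checking the summary lines by hand, the arithmetic can be sketched as follows. This is not cmp.py's code, only the calculation its output appears to follow; pct_change and tally are hypothetical names. The per-fold rates are copied from the listing above, where they are already rounded to three decimals.

```python
def pct_change(before, after):
    """Relative change from before to after, in percent (before must be nonzero)."""
    return (after - before) / before * 100.0

def tally(before, after):
    """Count folds where the second run won (lower rate), tied, or lost."""
    won = sum(a < b for b, a in zip(before, after))
    tied = sum(a == b for b, a in zip(before, after))
    lost = sum(a > b for b, a in zip(before, after))
    return won, tied, lost

# Per-fold false negative percentages copied from the listing above.
fn_before = [0.218, 0.364, 0.000, 0.218, 0.218, 0.291, 0.218, 0.145, 0.291, 0.073]
fn_after = [0.145, 0.364, 0.073, 0.218, 0.218, 0.145, 0.073, 0.145, 0.218, 0.000]

print(tally(fn_before, fn_after))      # (5, 4, 1)
print(round(pct_change(28, 22), 2))    # -21.43
```

A fold that starts at 0 has no defined percentage change, which is presumably why the listing shows "+(was 0)" there instead of a number.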
On much smaller random-subset 10-fold c-v runs with
-> <stat> tested 300 hams & 300 spams against 2700 hams & 2700 spams
the effect is usually a significant decrease in the f-n rate, plus a random
bump up or down in the f-p stats similar to the one in the huge test.
Dropping the value of hambias probably accounts for the f-n goodness; the
rest is largely there to prevent f-p badness at the same time. Expanding the
min/max-prob range,
coupled with taking into account the number of msgs that go into each
probability estimate, all but eliminates "massive cancellation" of
MIN_SPAMPROB and MAX_SPAMPROB clues.
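
To put numbers on "massive cancellation", here is a minimal sketch. It assumes a Graham-style combining rule, prod(p) / (prod(p) + prod(1-p)), which this message does not spell out. graham_combine is a hypothetical helper, not code from the project, and the clue values are made up for illustration.

```python
from math import prod

def graham_combine(probs):
    """Combine per-word spam probabilities with the assumed Graham-style rule."""
    p = prod(probs)
    q = prod(1.0 - x for x in probs)
    return p / (p + q)

# Three clues pinned at each extreme of a 0.01/0.99 clamp: they cancel,
# and the combined score lands at about 0.5 regardless of anything else.
cancelled = graham_combine([0.99] * 3 + [0.01] * 3)

# If well-supported spam clues may reach 0.999 while the ham clues stay at
# 0.01 (illustrative values only), the extremes no longer cancel.
uneven = graham_combine([0.999] * 3 + [0.01] * 3)

print(cancelled, uneven)
```

With equal-strength extremes the score carries no information; once the clue strengths differ, the better-supported side dominates.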