[Spambayes] chi-squared versus "prob strength"
Anthony Baxter
anthony@interlink.com.au
Mon, 14 Oct 2002 16:31:31 +1000
>>> Tim Peters wrote
> I was, but more importantly my test data agreed, so I'm going to switch to
> this (the evidence is so consistent and solid on both our datasets that
> making it an option would supply a pointless choice -- losers are killed).
> Good show!
Here's what my mungo-test set shows for this (before is pre-Rob Hooft's
change, after is current CVS):
chi2s.txt -> chi2as.txt
-> <stat> tested 3490 hams & 1687 spams against 31410 hams & 15161 spams
-> <stat> tested 3490 hams & 1682 spams against 31410 hams & 15166 spams
-> <stat> tested 3490 hams & 1688 spams against 31410 hams & 15160 spams
-> <stat> tested 3490 hams & 1679 spams against 31410 hams & 15169 spams
-> <stat> tested 3490 hams & 1686 spams against 31410 hams & 15162 spams
-> <stat> tested 3490 hams & 1688 spams against 31410 hams & 15160 spams
-> <stat> tested 3490 hams & 1678 spams against 31410 hams & 15170 spams
-> <stat> tested 3490 hams & 1688 spams against 31410 hams & 15160 spams
-> <stat> tested 3490 hams & 1683 spams against 31410 hams & 15165 spams
-> <stat> tested 3490 hams & 1689 spams against 31410 hams & 15159 spams
-> <stat> tested 3490 hams & 1687 spams against 31410 hams & 15161 spams
-> <stat> tested 3490 hams & 1682 spams against 31410 hams & 15166 spams
-> <stat> tested 3490 hams & 1688 spams against 31410 hams & 15160 spams
-> <stat> tested 3490 hams & 1679 spams against 31410 hams & 15169 spams
-> <stat> tested 3490 hams & 1686 spams against 31410 hams & 15162 spams
-> <stat> tested 3490 hams & 1688 spams against 31410 hams & 15160 spams
-> <stat> tested 3490 hams & 1678 spams against 31410 hams & 15170 spams
-> <stat> tested 3490 hams & 1688 spams against 31410 hams & 15160 spams
-> <stat> tested 3490 hams & 1683 spams against 31410 hams & 15165 spams
-> <stat> tested 3490 hams & 1689 spams against 31410 hams & 15159 spams
false positive percentages
0.946 0.974 lost +2.96%
0.917 0.917 tied
0.802 0.831 lost +3.62%
0.659 0.860 lost +30.50%
0.573 0.659 lost +15.01%
0.802 0.831 lost +3.62%
0.716 0.745 lost +4.05%
0.516 0.544 lost +5.43%
0.630 0.688 lost +9.21%
0.917 1.003 lost +9.38%
won 0 times
tied 1 times
lost 9 times
total unique fp went from 261 to 281 lost +7.66%
mean fp % went from 0.747851002865 to 0.805157593123 lost +7.66%
false negative percentages
0.356 0.296 won -16.85%
0.119 0.059 won -50.42%
0.237 0.237 tied
0.476 0.476 tied
0.297 0.237 won -20.20%
0.415 0.415 tied
0.596 0.477 won -19.97%
0.296 0.237 won -19.93%
0.416 0.416 tied
0.355 0.296 won -16.62%
won 6 times
tied 4 times
lost 0 times
total unique fn went from 60 to 53 won -11.67%
mean fn % went from 0.356257958499 to 0.314689990048 won -11.67%
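For anyone wanting to check the won/lost percentages by hand, here is a minimal sketch of the arithmetic, assuming each figure is just the relative change from the "before" value to the "after" value. The function names are my own for illustration and are not taken from the comparison script.

```python
def pct_change(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


def verdict(before, after):
    """Label a change the way the listing does: a lower error count is a win."""
    if after < before:
        return "won"
    if after > before:
        return "lost"
    return "tied"


# Totals copied from the listing above.
for name, before, after in [("fp", 261, 281), ("fn", 60, 53)]:
    print(f"total unique {name} went from {before} to {after} "
          f"{verdict(before, after)} {pct_change(before, after):+.2f}%")
```

Run on the two totals, this reproduces the +7.66% and -11.67% shown in the listing.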
ham mean                     ham sdev
   3.46    3.24   -6.36%       12.12   11.96   -1.32%
   3.01    2.85   -5.32%       11.48   11.39   -0.78%
   3.28    3.01   -8.23%       11.45   11.22   -2.01%
   3.23    3.02   -6.50%       11.43   11.27   -1.40%
   3.15    2.88   -8.57%       10.65   10.37   -2.63%
   3.17    2.95   -6.94%       11.30   11.07   -2.04%
   3.27    3.02   -7.65%       11.29   10.94   -3.10%
   3.06    2.82   -7.84%       10.51   10.20   -2.95%
   3.32    3.13   -5.72%       11.37   11.18   -1.67%
   3.45    3.21   -6.96%       11.75   11.59   -1.36%
ham mean and sdev for all runs
   3.24    3.01   -7.10%       11.34   11.13   -1.85%
spam mean                    spam sdev
  99.75   99.76   +0.01%        3.91    3.85   -1.53%
  99.90   99.91   +0.01%        1.62    1.38  -14.81%
  99.81   99.82   +0.01%        3.09    3.05   -1.29%
  99.60   99.62   +0.02%        4.92    4.80   -2.44%
  99.78   99.78   +0.00%        3.24    3.36   +3.70%
  99.78   99.78   +0.00%        3.04    3.14   +3.29%
  99.62   99.62   +0.00%        4.73    4.78   +1.06%
  99.79   99.81   +0.02%        2.75    2.66   -3.27%
  99.66   99.66   +0.00%        4.47    4.62   +3.36%
  99.70   99.70   +0.00%        4.37    4.32   -1.14%
spam mean and sdev for all runs
  99.74   99.75   +0.01%        3.75    3.75   +0.00%
ham/spam mean difference: 96.50 96.74 +0.24
Here are the histograms from the 'after' case:
-> <stat> Ham scores for all runs: 34900 items; mean 3.01; sdev 11.13
-> <stat> min -9.99201e-14; median 0.000498415; max 100
* = 448 items
0.0 27319 *************************************************************
0.5 1129 ***
1.0 695 **
1.5 507 **
2.0 412 *
2.5 320 *
3.0 269 *
3.5 241 *
4.0 194 *
4.5 178 *
5.0 151 *
5.5 114 *
6.0 131 *
6.5 129 *
7.0 106 *
7.5 104 *
8.0 103 *
8.5 84 *
9.0 76 *
9.5 85 *
10.0 65 *
10.5 60 *
11.0 73 *
11.5 54 *
12.0 63 *
12.5 50 *
13.0 59 *
13.5 51 *
14.0 65 *
14.5 43 *
15.0 31 *
15.5 50 *
16.0 40 *
16.5 38 *
17.0 39 *
17.5 37 *
18.0 27 *
18.5 31 *
19.0 40 *
19.5 31 *
20.0 41 *
20.5 27 *
21.0 27 *
21.5 29 *
22.0 26 *
22.5 34 *
23.0 23 *
23.5 26 *
24.0 31 *
24.5 23 *
25.0 12 *
25.5 15 *
26.0 16 *
26.5 27 *
27.0 27 *
27.5 27 *
28.0 18 *
28.5 25 *
29.0 16 *
29.5 19 *
30.0 19 *
30.5 17 *
31.0 14 *
31.5 18 *
32.0 16 *
32.5 12 *
33.0 29 *
33.5 19 *
34.0 6 *
34.5 15 *
35.0 14 *
35.5 15 *
36.0 19 *
36.5 11 *
37.0 9 *
37.5 12 *
38.0 13 *
38.5 10 *
39.0 12 *
39.5 15 *
40.0 13 *
40.5 12 *
41.0 9 *
41.5 14 *
42.0 14 *
42.5 13 *
43.0 21 *
43.5 16 *
44.0 11 *
44.5 7 *
45.0 10 *
45.5 8 *
46.0 9 *
46.5 10 *
47.0 9 *
47.5 9 *
48.0 9 *
48.5 10 *
49.0 12 *
49.5 20 *
50.0 31 *
50.5 8 *
51.0 12 *
51.5 6 *
52.0 10 *
52.5 8 *
53.0 10 *
53.5 3 *
54.0 9 *
54.5 5 *
55.0 16 *
55.5 14 *
56.0 6 *
56.5 7 *
57.0 10 *
57.5 8 *
58.0 6 *
58.5 7 *
59.0 11 *
59.5 3 *
60.0 5 *
60.5 9 *
61.0 3 *
61.5 5 *
62.0 5 *
62.5 5 *
63.0 5 *
63.5 9 *
64.0 10 *
64.5 8 *
65.0 5 *
65.5 7 *
66.0 7 *
66.5 3 *
67.0 3 *
67.5 5 *
68.0 7 *
68.5 3 *
69.0 5 *
69.5 6 *
70.0 6 *
70.5 3 *
71.0 2 *
71.5 5 *
72.0 5 *
72.5 1 *
73.0 1 *
73.5 6 *
74.0 2 *
74.5 8 *
75.0 5 *
75.5 5 *
76.0 5 *
76.5 7 *
77.0 5 *
77.5 3 *
78.0 4 *
78.5 4 *
79.0 2 *
79.5 2 *
80.0 2 *
80.5 4 *
81.0 7 *
81.5 4 *
82.0 6 *
82.5 5 *
83.0 1 *
83.5 5 *
84.0 4 *
84.5 2 *
85.0 4 *
85.5 4 *
86.0 2 *
86.5 1 *
87.0 8 *
87.5 6 *
88.0 3 *
88.5 5 *
89.0 2 *
89.5 3 *
90.0 0
90.5 0
91.0 1 *
91.5 3 *
92.0 1 *
92.5 3 *
93.0 5 *
93.5 5 *
94.0 5 *
94.5 3 *
95.0 8 *
95.5 4 *
96.0 1 *
96.5 3 *
97.0 5 *
97.5 4 *
98.0 5 *
98.5 8 *
99.0 8 *
99.5 50 *
-> <stat> Spam scores for all runs: 16848 items; mean 99.75; sdev 3.75
-> <stat> min 0.00333927; median 100; max 100
* = 273 items
0.0 1 *
0.5 1 *
1.0 1 *
1.5 0
2.0 0
2.5 1 *
3.0 1 *
3.5 0
4.0 0
4.5 0
5.0 1 *
5.5 0
6.0 0
6.5 0
7.0 0
7.5 0
8.0 1 *
8.5 0
9.0 1 *
9.5 0
10.0 0
10.5 0
11.0 0
11.5 0
12.0 1 *
12.5 0
13.0 0
13.5 0
14.0 2 *
14.5 0
15.0 0
15.5 0
16.0 0
16.5 0
17.0 1 *
17.5 2 *
18.0 0
18.5 0
19.0 0
19.5 0
20.0 0
20.5 0
21.0 1 *
21.5 1 *
22.0 0
22.5 0
23.0 0
23.5 0
24.0 0
24.5 2 *
25.0 0
25.5 1 *
26.0 1 *
26.5 0
27.0 0
27.5 0
28.0 0
28.5 0
29.0 0
29.5 0
30.0 0
30.5 0
31.0 0
31.5 0
32.0 0
32.5 0
33.0 0
33.5 0
34.0 0
34.5 0
35.0 0
35.5 0
36.0 0
36.5 0
37.0 0
37.5 1 *
38.0 1 *
38.5 0
39.0 0
39.5 0
40.0 0
40.5 0
41.0 0
41.5 1 *
42.0 0
42.5 1 *
43.0 0
43.5 1 *
44.0 0
44.5 0
45.0 0
45.5 0
46.0 0
46.5 1 *
47.0 0
47.5 0
48.0 0
48.5 1 *
49.0 0
49.5 2 *
50.0 3 *
50.5 2 *
51.0 0
51.5 2 *
52.0 0
52.5 1 *
53.0 0
53.5 0
54.0 1 *
54.5 0
55.0 0
55.5 0
56.0 0
56.5 2 *
57.0 1 *
57.5 1 *
58.0 0
58.5 1 *
59.0 0
59.5 0
60.0 1 *
60.5 0
61.0 0
61.5 1 *
62.0 0
62.5 0
63.0 0
63.5 1 *
64.0 0
64.5 1 *
65.0 0
65.5 2 *
66.0 1 *
66.5 0
67.0 0
67.5 0
68.0 1 *
68.5 0
69.0 1 *
69.5 1 *
70.0 0
70.5 1 *
71.0 0
71.5 1 *
72.0 0
72.5 3 *
73.0 0
73.5 0
74.0 1 *
74.5 0
75.0 1 *
75.5 1 *
76.0 0
76.5 1 *
77.0 1 *
77.5 0
78.0 6 *
78.5 0
79.0 1 *
79.5 0
80.0 1 *
80.5 0
81.0 1 *
81.5 1 *
82.0 1 *
82.5 1 *
83.0 0
83.5 0
84.0 0
84.5 0
85.0 3 *
85.5 1 *
86.0 2 *
86.5 2 *
87.0 2 *
87.5 0
88.0 2 *
88.5 0
89.0 0
89.5 2 *
90.0 0
90.5 1 *
91.0 0
91.5 0
92.0 4 *
92.5 5 *
93.0 2 *
93.5 1 *
94.0 2 *
94.5 4 *
95.0 2 *
95.5 8 *
96.0 3 *
96.5 5 *
97.0 9 *
97.5 10 *
98.0 9 *
98.5 22 *
99.0 44 *
99.5 16628 *************************************************************
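A note on reading the bars: the "* = N items" lines are consistent with scaling the tallest bucket to fit a fixed width and rounding every non-empty bucket up to at least one star. Below is a rough sketch of that scaling. The 61-column width is my assumption (it reproduces both the 448 and the 273 above), and the function names are made up for illustration rather than copied from the histogram code.

```python
def star_unit(biggest, width=61):
    """Items represented by one '*', rounded up so the tallest bar fits in `width`."""
    return max(1, -(-biggest // width))  # ceiling division


def stars(count, unit):
    """Bar for one bucket: one '*' per `unit` items, plus one for any remainder."""
    return "*" * (-(-count // unit))


ham_unit = star_unit(27319)   # tallest ham bucket (0.0)
spam_unit = star_unit(16628)  # tallest spam bucket (99.5)
print("* =", ham_unit, "items")
print("* =", spam_unit, "items")
print("0.5", 1129, stars(1129, ham_unit))
```

Under this reading a single star can mean anything from 1 to 448 ham (or 1 to 273 spam), so the long tails in both histograms look much heavier than they really are; the raw counts are the numbers to trust.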
-> best cutoff for all runs: 0.995
-> with weighted total 10*50 fp + 220 fn = 720
-> fp rate 0.143% fn rate 1.31%
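The "weighted total" line can be checked the same way. This is a small sketch assuming the cost is simply fp_weight * fp + fn with a weight of 10 (as the "10*50 fp + 220 fn" text suggests) and that the rates are taken over all 34900 hams and 16848 spams from the histogram headers. The helper names are hypothetical.

```python
def weighted_cost(fp, fn, fp_weight=10):
    """Total cost when each false positive counts as `fp_weight` false negatives."""
    return fp_weight * fp + fn


def rate_pct(errors, total):
    """Error rate as a percentage of the messages of that class."""
    return 100.0 * errors / total


n_ham, n_spam = 34900, 16848  # totals from the histogram headers
fp, fn = 50, 220              # counts at the reported best cutoff

print("weighted total", weighted_cost(fp, fn))
print(f"fp rate {rate_pct(fp, n_ham):.3g}% fn rate {rate_pct(fn, n_spam):.3g}%")
```

This gives 720, 0.143% and 1.31%, matching the three lines above.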