[Spambayes] timcombine comparison with varied cutoff

Brad Clements bkc@murkworks.com
Thu, 10 Oct 2002 10:57:46 -0400


I re-ran the comparison of use_tim_combine false --> true.

This time I set spam_cutoff to the recommended value for each case: 0.53 for 
false and 0.57 for true.

Then I ran rates and cmp.
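
For readers unfamiliar with the cutoff, here is a minimal sketch of what the two settings do. It assumes a message is called spam when its score is above spam_cutoff and that scores lie on a 0.0-1.0 scale. The function name `classify` and the sample scores are hypothetical, made up for illustration, and are not taken from the test run.

```python
def classify(score, spam_cutoff):
    """Return 'spam' when the score is above the cutoff, else 'ham'.

    Assumption: strict greater-than comparison on a 0.0-1.0 score.
    """
    return "spam" if score > spam_cutoff else "ham"


# Hypothetical scores, chosen so one of them falls between the two cutoffs.
sample_scores = [0.12, 0.55, 0.91]

for cutoff in (0.53, 0.57):
    labels = [classify(s, cutoff) for s in sample_scores]
    print(cutoff, labels)
```

A score that falls between 0.53 and 0.57 is the only kind that changes label purely because of the cutoff; the rest of the difference between the two runs comes from the combining scheme itself.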

results/timcombinefalse053s.txt -> results/timcombinetrue057s.txt
-> <stat> tested 1300 hams & 1300 spams against 11700 hams & 11700 spams
-> <stat> tested 1300 hams & 1300 spams against 11700 hams & 11700 spams
-> <stat> tested 1300 hams & 1300 spams against 11700 hams & 11700 spams
-> <stat> tested 1300 hams & 1300 spams against 11700 hams & 11700 spams
-> <stat> tested 1300 hams & 1300 spams against 11700 hams & 11700 spams
-> <stat> tested 1300 hams & 1300 spams against 11700 hams & 11700 spams
-> <stat> tested 1300 hams & 1300 spams against 11700 hams & 11700 spams
-> <stat> tested 1300 hams & 1300 spams against 11700 hams & 11700 spams
-> <stat> tested 1300 hams & 1300 spams against 11700 hams & 11700 spams
-> <stat> tested 1300 hams & 1300 spams against 11700 hams & 11700 spams
-> <stat> tested 1300 hams & 1300 spams against 11700 hams & 11700 spams
-> <stat> tested 1300 hams & 1300 spams against 11700 hams & 11700 spams
-> <stat> tested 1300 hams & 1300 spams against 11700 hams & 11700 spams
-> <stat> tested 1300 hams & 1300 spams against 11700 hams & 11700 spams
-> <stat> tested 1300 hams & 1300 spams against 11700 hams & 11700 spams
-> <stat> tested 1300 hams & 1300 spams against 11700 hams & 11700 spams
-> <stat> tested 1300 hams & 1300 spams against 11700 hams & 11700 spams
-> <stat> tested 1300 hams & 1300 spams against 11700 hams & 11700 spams
-> <stat> tested 1300 hams & 1300 spams against 11700 hams & 11700 spams
-> <stat> tested 1300 hams & 1300 spams against 11700 hams & 11700 spams

false positive percentages
    0.385  0.231  won    -40.00%
    0.462  0.462  tied          
    0.385  0.231  won    -40.00%
    0.615  0.538  won    -12.52%
    0.462  0.231  won    -50.00%
    0.231  0.231  tied          
    0.154  0.154  tied          
    0.154  0.154  tied          
    0.615  0.538  won    -12.52%
    0.385  0.308  won    -20.00%

won   6 times
tied  4 times
lost  0 times

total unique fp went from 50 to 40 won    -20.00%
mean fp % went from 0.384615384615 to 0.307692307692 won    -20.00%
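
As a rough check on the fp totals above, the mean percentages can be reproduced from the raw counts. This sketch assumes 10 runs of 1300 hams each (13000 hams tested in total, read off the <stat> lines); the helper names `rate_percent` and `relative_change` are my own.

```python
def rate_percent(errors, messages):
    """Error rate as a percentage of the messages tested."""
    return 100.0 * errors / messages


def relative_change(before, after):
    """Percentage change from before to after (negative means fewer errors)."""
    return 100.0 * (after - before) / before


# Assumption: 10 runs, 1300 hams per run.
total_hams = 10 * 1300
fp_before, fp_after = 50, 40

print(round(rate_percent(fp_before, total_hams), 6))   # 0.384615
print(round(rate_percent(fp_after, total_hams), 6))    # 0.307692
print(round(relative_change(fp_before, fp_after), 2))  # -20.0
```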

false negative percentages
    0.308  0.385  lost   +25.00%
    0.385  0.385  tied          
    0.385  0.385  tied          
    0.385  0.462  lost   +20.00%
    0.231  0.231  tied          
    0.308  0.385  lost   +25.00%
    0.385  0.385  tied          
    0.308  0.385  lost   +25.00%
    0.308  0.538  lost   +74.68%
    0.308  0.385  lost   +25.00%

won   0 times
tied  4 times
lost  6 times

total unique fn went from 43 to 51 lost   +18.60%
mean fn % went from 0.330769230769 to 0.392307692307 lost   +18.60%
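
The per-run fn percentages can also be turned back into message counts, which shows where the totals of 43 and 51 come from. This is a sketch assuming 1300 spams per run, as in the <stat> lines; `to_count` is a hypothetical helper, and rounding is needed because the table shows only three decimals.

```python
SPAMS_PER_RUN = 1300  # assumption, taken from the <stat> lines

# Per-run false negative percentages copied from the table above.
fn_before = [0.308, 0.385, 0.385, 0.385, 0.231, 0.308, 0.385, 0.308, 0.308, 0.308]
fn_after = [0.385, 0.385, 0.385, 0.462, 0.231, 0.385, 0.385, 0.385, 0.538, 0.385]


def to_count(percent, messages):
    """Convert a rounded percentage back to a whole number of messages."""
    return round(percent / 100.0 * messages)


total_before = sum(to_count(p, SPAMS_PER_RUN) for p in fn_before)
total_after = sum(to_count(p, SPAMS_PER_RUN) for p in fn_after)

print(total_before, total_after)  # 43 51
print(round(100.0 * (total_after - total_before) / total_before, 2))  # 18.6
```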

ham mean                     ham sdev
  25.47   12.23  -51.98%        7.31    9.02  +23.39%
  25.37   12.04  -52.54%        7.07    8.57  +21.22%
  25.56   12.08  -52.74%        6.96    8.44  +21.26%
  25.57   12.21  -52.25%        7.09    8.65  +22.00%
  25.33   11.98  -52.70%        6.94    8.40  +21.04%
  25.56   12.20  -52.27%        6.77    8.16  +20.53%
  25.29   11.69  -53.78%        6.71    7.80  +16.24%
  25.19   11.61  -53.91%        6.71    7.91  +17.88%
  25.07   11.63  -53.61%        7.02    8.31  +18.38%
  25.14   11.60  -53.86%        6.88    7.94  +15.41%

ham mean and sdev for all runs
  25.35   11.93  -52.94%        6.95    8.33  +19.86%

spam mean                    spam sdev
  80.93   90.31  +11.59%        7.72    7.59   -1.68%
  81.17   90.59  +11.61%        7.73    7.68   -0.65%
  81.36   90.72  +11.50%        7.52    7.40   -1.60%
  81.51   90.91  +11.53%        7.40    7.16   -3.24%
  81.02   90.54  +11.75%        7.19    6.93   -3.62%
  81.26   90.68  +11.59%        7.41    7.23   -2.43%
  81.03   90.49  +11.67%        7.52    7.25   -3.59%
  81.08   90.61  +11.75%        7.48    7.29   -2.54%
  81.47   90.93  +11.61%        7.54    7.21   -4.38%
  80.93   90.40  +11.70%        7.95    7.80   -1.89%

spam mean and sdev for all runs
  81.18   90.62  +11.63%        7.55    7.36   -2.52%

ham/spam mean difference: 55.83 78.69 +22.86
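
The mean difference line is just the spam mean minus the ham mean for each configuration, using the all-runs figures above. A small sketch of that arithmetic (variable names are mine; values are rounded to two decimals to match the report):

```python
# All-runs means copied from the tables above (false, then true).
ham_before, ham_after = 25.35, 11.93
spam_before, spam_after = 81.18, 90.62

# Separation between the spam and ham score means in each configuration.
sep_before = round(spam_before - ham_before, 2)
sep_after = round(spam_after - ham_after, 2)
gain = round(sep_after - sep_before, 2)

print(sep_before, sep_after, gain)  # 55.83 78.69 22.86
```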


Brad Clements,                bkc@murkworks.com   (315)268-1000
http://www.murkworks.com                          (315)268-9812 Fax
AOL-IM: BKClements