[Spambayes] 2nd run with received lines

Brad Clements bkc@murkworks.com
Sun, 22 Sep 2002 11:50:15 -0400


I added this option:

[Tokenizer]
mine_received_headers: True

and did the run again.

I would like to see how the per-run stdev changes from one run to the next,
since I think reducing the stdev is also a win. I'll see if I can tweak
rates.py and cmp.py to extract this info.

false positive percentages
    0.692  0.538  won    -22.25%
    0.385  0.308  won    -20.00%
    0.385  0.308  won    -20.00%
    0.692  0.538  won    -22.25%
    0.462  0.385  won    -16.67%
    0.154  0.154  tied
    0.462  0.462  tied
    0.385  0.231  won    -40.00%
    0.769  0.538  won    -30.04%
    0.462  0.385  won    -16.67%

won   8 times
tied  2 times
lost  0 times

total unique fp went from 63 to 50 won    -20.63%
mean fp % went from 0.484615384615 to 0.384615384615 won    -20.63%

false negative percentages
    1.692  1.615  won     -4.55%
    2.462  2.385  won     -3.13%
    2.154  2.000  won     -7.15%
    1.923  1.615  won    -16.02%
    1.846  1.769  won     -4.17%
    1.692  1.615  won     -4.55%
    1.538  1.385  won     -9.95%
    2.077  1.846  won    -11.12%
    2.077  2.000  won     -3.71%
    2.154  2.077  won     -3.57%

won  10 times
tied  0 times
lost  0 times

total unique fn went from 255 to 238 won     -6.67%
mean fn % went from 1.96153846154 to 1.83076923077 won     -6.67%
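
A rough sketch of the stdev comparison mentioned above, using the per-run
percentages copied from the two tables. This is not what rates.py or cmp.py
do; the helper name stdev_change and the choice of the sample standard
deviation are assumptions made for illustration only.

```python
import statistics

# (before, after) percentages for each of the 10 runs, copied from the
# cmp.py output above.
fp_pairs = [
    (0.692, 0.538), (0.385, 0.308), (0.385, 0.308), (0.692, 0.538),
    (0.462, 0.385), (0.154, 0.154), (0.462, 0.462), (0.385, 0.231),
    (0.769, 0.538), (0.462, 0.385),
]
fn_pairs = [
    (1.692, 1.615), (2.462, 2.385), (2.154, 2.000), (1.923, 1.615),
    (1.846, 1.769), (1.692, 1.615), (1.538, 1.385), (2.077, 1.846),
    (2.077, 2.000), (2.154, 2.077),
]


def stdev_change(pairs):
    """Return (stdev before, stdev after, percent change in stdev).

    Hypothetical helper: uses the sample standard deviation (n - 1 in the
    denominator), which is an assumption, not something the post specifies.
    """
    before = [b for b, _ in pairs]
    after = [a for _, a in pairs]
    sd_before = statistics.stdev(before)
    sd_after = statistics.stdev(after)
    pct = (sd_after - sd_before) / sd_before * 100.0
    return sd_before, sd_after, pct


for label, pairs in (("fp", fp_pairs), ("fn", fn_pairs)):
    sd_before, sd_after, pct = stdev_change(pairs)
    verdict = "won" if sd_after < sd_before else "lost" if sd_after > sd_before else "tied"
    print(f"stdev {label} % went from {sd_before:.4f} to {sd_after:.4f} "
          f"{verdict} {pct:+.2f}%")
```

The won/lost/tied wording just mirrors the cmp.py output format; a smaller
stdev after the change is counted as a win.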



Brad Clements,                bkc@murkworks.com   (315)268-1000
http://www.murkworks.com                          (315)268-9812 Fax
AOL-IM: BKClements