[Spambayes] On counting words more than once

Guido van Rossum guido@python.org
Sun, 29 Sep 2002 10:37:41 -0400


It's a win for me too (total 2000 ham, 2000 spam):

false positive percentages
    0.000  0.000  tied          
    0.500  0.500  tied          
    0.000  0.000  tied          
    0.500  0.000  won   -100.00%
    1.000  0.500  won    -50.00%
    0.000  0.000  tied          
    1.000  0.500  won    -50.00%
    0.500  0.500  tied          
    0.000  0.000  tied          
    0.500  0.500  tied          

won   3 times
tied  7 times
lost  0 times

total unique fp went from 8 to 5 won    -37.50%
mean fp % went from 0.4 to 0.25 won    -37.50%

false negative percentages
    0.500  0.500  tied          
    0.000  0.000  tied          
    0.000  0.000  tied          
    0.000  0.000  tied          
    0.500  0.500  tied          
    0.500  0.500  tied          
    0.000  0.000  tied          
    0.000  0.000  tied          
    0.000  0.000  tied          
    0.000  0.000  tied          

won   0 times
tied 10 times
lost  0 times

total unique fn went from 3 to 3 tied          
mean fn % went from 0.15 to 0.15 tied          
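
For anyone who wants to check the arithmetic, the summary lines above can be reproduced from the per-run columns. This is only a minimal sketch: the names `tally`, `mean` and `pct_change` are made up for illustration and are not the comparison script that produced this output. It assumes the first column is the rate before the change, the second the rate after, and that a lower rate counts as a win.

```python
# Hypothetical re-creation of the summary arithmetic; not the actual
# comparison script. Columns are assumed to be (before, after) in percent.
fp_before = [0.000, 0.500, 0.000, 0.500, 1.000, 0.000, 1.000, 0.500, 0.000, 0.500]
fp_after = [0.000, 0.500, 0.000, 0.000, 0.500, 0.000, 0.500, 0.500, 0.000, 0.500]
fn_before = [0.500, 0.000, 0.000, 0.000, 0.500, 0.500, 0.000, 0.000, 0.000, 0.000]
fn_after = list(fn_before)  # every false negative row is tied


def tally(before, after):
    """Count runs where the error rate dropped (won), held (tied), or rose (lost)."""
    won = sum(a < b for b, a in zip(before, after))
    lost = sum(a > b for b, a in zip(before, after))
    tied = len(before) - won - lost
    return won, tied, lost


def mean(values):
    """Arithmetic mean of the per-run percentages."""
    return sum(values) / len(values)


def pct_change(before, after):
    """Relative change in percent; negative means the error rate went down."""
    return (after - before) / before * 100.0


if __name__ == "__main__":
    print("fp won/tied/lost:", tally(fp_before, fp_after))
    print("fn won/tied/lost:", tally(fn_before, fn_after))
    print("mean fp %% went from %g to %g (%+.2f%%)" % (
        mean(fp_before), mean(fp_after),
        pct_change(mean(fp_before), mean(fp_after))))
    print("unique fp 8 -> 5 (%+.2f%%)" % pct_change(8, 5))
```

The per-row figures fall out the same way: 0.500 to 0.000 is a -100% change, and 1.000 to 0.500 is -50%.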

--Guido van Rossum (home page: http://www.python.org/~guido/)