[Spambayes] Need testers!

Tim Peters tim.one@comcast.net
Sat, 14 Sep 2002 19:58:40 -0400


[Tim]
> ...
> Since I got rid of MINCOUNT, a clue in a single ham message is as
> strong as a clue that appears in every ham message.  This makes me
> squirm, but testing said it was a win.  (BTW, I've tried, but not
> reported on, a dozen schemes to give less weight to high-prob clues
> that appear in few messages; every such attempt has been a loser.)

The one I just checked in may be a winner, though.  It needs testing on a
wide variety of corpora and corpus sizes.

1. Run a baseline and save a summary file (rates.py).

2. Make exactly one change, adding

[Classifier]
adjust_probs_by_evidence_mass: True

   to your bayescustomize.ini file.

3. Run the same test scenario again, and create another summary file.

4. Run cmp.py over the summary files and post the cmp.py output
   (all of it, please).  Or mail it to me, but I think there's value
   in public humiliation <wink> if it helps get better results.

Here's my before-and-after cmp.py output:

Before: adjust_probs_by_evidence_mass: False
After:  adjust_probs_by_evidence_mass: True

"""
-> <stat> tested 2000 hams & 1375 spams against 18000 hams & 12375 spams
-> <stat> tested 2000 hams & 1375 spams against 18000 hams & 12375 spams
-> <stat> tested 2000 hams & 1375 spams against 18000 hams & 12375 spams
-> <stat> tested 2000 hams & 1375 spams against 18000 hams & 12375 spams
-> <stat> tested 2000 hams & 1375 spams against 18000 hams & 12375 spams
-> <stat> tested 2000 hams & 1375 spams against 18000 hams & 12375 spams
-> <stat> tested 2000 hams & 1375 spams against 18000 hams & 12375 spams
-> <stat> tested 2000 hams & 1375 spams against 18000 hams & 12375 spams
-> <stat> tested 2000 hams & 1375 spams against 18000 hams & 12375 spams
-> <stat> tested 2000 hams & 1375 spams against 18000 hams & 12375 spams
-> <stat> tested 2000 hams & 1375 spams against 18000 hams & 12375 spams
-> <stat> tested 2000 hams & 1375 spams against 18000 hams & 12375 spams
-> <stat> tested 2000 hams & 1375 spams against 18000 hams & 12375 spams
-> <stat> tested 2000 hams & 1375 spams against 18000 hams & 12375 spams
-> <stat> tested 2000 hams & 1375 spams against 18000 hams & 12375 spams
-> <stat> tested 2000 hams & 1375 spams against 18000 hams & 12375 spams
-> <stat> tested 2000 hams & 1375 spams against 18000 hams & 12375 spams
-> <stat> tested 2000 hams & 1375 spams against 18000 hams & 12375 spams
-> <stat> tested 2000 hams & 1375 spams against 18000 hams & 12375 spams
-> <stat> tested 2000 hams & 1375 spams against 18000 hams & 12375 spams

false positive percentages
    0.000  0.000  tied
    0.000  0.000  tied
    0.000  0.050  lost  +(was 0)
    0.000  0.000  tied
    0.050  0.050  tied
    0.000  0.050  lost  +(was 0)
    0.000  0.000  tied
    0.050  0.050  tied
    0.000  0.000  tied
    0.100  0.050  won    -50.00%

won   1 times
tied  7 times
lost  2 times

total unique fp went from 4 to 5 lost   +25.00%
mean fp % went from 0.02 to 0.025 lost   +25.00%

false negative percentages
    0.218  0.073  won    -66.51%
    0.364  0.218  won    -40.11%
    0.000  0.000  tied
    0.218  0.145  won    -33.49%
    0.218  0.218  tied
    0.291  0.218  won    -25.09%
    0.218  0.291  lost   +33.49%
    0.145  0.218  lost   +50.34%
    0.291  0.291  tied
    0.073  0.000  won   -100.00%

won   5 times
tied  3 times
lost  2 times

total unique fn went from 28 to 23 won    -17.86%
mean fn % went from 0.203636363636 to 0.167272727273 won    -17.86%
"""

If you can make time, also try two other variations, changing the line

                    dist *= sum / (sum + 1.0)

in classifier.GrahamBayes.update_probabilities().  Try it once replacing 1.0
with 2.0, and another time replacing 1.0 with 0.5.  (The closer this
constant is to 0, the more this will act as if the new code weren't there.)
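To make the effect of that line concrete, here's a toy sketch of the idea as
I understand it (the function and argument names are mine; in the real code,
dist is the word probability's distance from the neutral 0.5, and sum is the
evidence count behind it):

```python
def adjust_prob(prob, evidence_count, s=1.0):
    # Shrink a word's spam probability toward the neutral 0.5 in
    # proportion to how little evidence backs it.  With lots of
    # evidence the factor approaches 1 (no change); with a count
    # of 1 and s=1.0 the distance from 0.5 is halved.
    dist = prob - 0.5
    dist *= evidence_count / (evidence_count + s)
    return 0.5 + dist
```

A clue seen once keeps only half its pull (0.99 becomes 0.745), while a clue
seen a thousand times is essentially untouched; raising s to 2.0 shrinks
low-evidence clues harder, and lowering it toward 0 turns the adjustment off.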

If you want to have some fun too <wink>, read the comments and try a
different way to get what it's aiming at.  A little thought separated by
hours of testing is how real progress is made on this kind of thing, though.