Third result ... RE: [Spambayes] First result from Gary Robinson's ideas

Sjoerd Mullender sjoerd@acm.org
Thu, 19 Sep 2002 11:54:08 +0200


On Thu, Sep 19 2002 Tim Peters wrote:

> I've checked in sane changes to the code base now, so that you can try Gary
> Robinson's increasingly remarkable probability combining scheme (it was
> amazing when I first tried it, and triply so when Neale reported isomorphic
> results on a wholly different corpus).  Just set
> 
>     [Classifier]
>     use_robinson_probability: True
> 
>     [TestDriver]
>     spam_cutoff: 0.50
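
For readers who want to check what those two options amount to, here is a minimal sketch that reads the same fragment with Python's standard configparser module. How the project itself loads its options is not shown in this message, so treat the parsing code as an illustration only; the section and option names are the ones quoted above.

```python
from configparser import ConfigParser

# The two settings quoted above, written out as a plain INI fragment.
SETTINGS = """
[Classifier]
use_robinson_probability: True

[TestDriver]
spam_cutoff: 0.50
"""

parser = ConfigParser()
parser.read_string(SETTINGS)

# getboolean/getfloat convert the raw strings to Python values.
use_robinson = parser.getboolean("Classifier", "use_robinson_probability")
spam_cutoff = parser.getfloat("TestDriver", "spam_cutoff")

print(use_robinson, spam_cutoff)
```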

Here are my results.  run1 used the defaults; run2 used the settings above.

By the way, I'm using runtest.sh, so I guess I'm number two, and
not Guido.  :-)

run1s -> run2s
-> <stat> tested 159 hams & 159 spams against 636 hams & 636 spams
-> <stat> tested 159 hams & 159 spams against 636 hams & 636 spams
-> <stat> tested 159 hams & 159 spams against 636 hams & 636 spams
-> <stat> tested 159 hams & 159 spams against 636 hams & 636 spams
-> <stat> tested 159 hams & 159 spams against 636 hams & 636 spams
-> <stat> tested 159 hams & 159 spams against 636 hams & 636 spams
-> <stat> tested 159 hams & 159 spams against 636 hams & 636 spams
-> <stat> tested 159 hams & 159 spams against 636 hams & 636 spams
-> <stat> tested 159 hams & 159 spams against 636 hams & 636 spams
-> <stat> tested 159 hams & 159 spams against 636 hams & 636 spams

false positive percentages
    0.000  0.000  tied
    0.000  0.000  tied
    0.000  0.000  tied
    0.000  0.000  tied
    0.000  0.000  tied

won   0 times
tied  5 times
lost  0 times

total unique fp went from 0 to 0 tied
mean fp % went from 0.0 to 0.0 tied

false negative percentages
    2.516  2.516  tied
    0.000  0.000  tied
    1.258  1.258  tied
    2.516  2.516  tied
    1.258  1.258  tied

won   0 times
tied  5 times
lost  0 times

total unique fn went from 12 to 12 tied
mean fn % went from 1.50943396226 to 1.50943396226 tied
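
The mean can be checked by hand from the per-run percentages.  The sketch below assumes each run tested 159 spams (as the <stat> lines say) and recovers the per-run miss counts by rounding; the variable names are made up for the example.

```python
# Per-run false negative percentages as printed above (identical for
# run1 and run2).
fn_percent_per_run = [2.516, 0.000, 1.258, 2.516, 1.258]
SPAMS_PER_RUN = 159  # from the "<stat> tested 159 hams & 159 spams" lines

# Convert each rounded percentage back into a whole number of messages.
fn_counts = [round(p / 100 * SPAMS_PER_RUN) for p in fn_percent_per_run]
total_fn = sum(fn_counts)

# Mean fn % over all five runs.
mean_fn_percent = 100.0 * total_fn / (SPAMS_PER_RUN * len(fn_counts))

print(fn_counts, total_fn, mean_fn_percent)
```

That gives 4, 0, 2, 4 and 2 misses per run, 12 in total out of 795 spams, which agrees with the 1.50943396226 reported above.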

The before case (run1, default settings):

Ham distribution for all runs:
* = 14 items
  0.00 794 *********************************************************
  2.50   1 *
  5.00   0 
  7.50   0 
 10.00   0 
 12.50   0 
 15.00   0 
 17.50   0 
 20.00   0 
 22.50   0 
 25.00   0 
 27.50   0 
 30.00   0 
 32.50   0 
 35.00   0 
 37.50   0 
 40.00   0 
 42.50   0 
 45.00   0 
 47.50   0 
 50.00   0 
 52.50   0 
 55.00   0 
 57.50   0 
 60.00   0 
 62.50   0 
 65.00   0 
 67.50   0 
 70.00   0 
 72.50   0 
 75.00   0 
 77.50   0 
 80.00   0 
 82.50   0 
 85.00   0 
 87.50   0 
 90.00   0 
 92.50   0 
 95.00   0 
 97.50   0 

Spam distribution for all runs:
* = 14 items
  0.00  12 *
  2.50   0 
  5.00   0 
  7.50   0 
 10.00   0 
 12.50   0 
 15.00   0 
 17.50   0 
 20.00   0 
 22.50   0 
 25.00   0 
 27.50   0 
 30.00   0 
 32.50   0 
 35.00   0 
 37.50   0 
 40.00   0 
 42.50   0 
 45.00   0 
 47.50   0 
 50.00   0 
 52.50   0 
 55.00   0 
 57.50   0 
 60.00   0 
 62.50   0 
 65.00   0 
 67.50   0 
 70.00   0 
 72.50   0 
 75.00   0 
 77.50   0 
 80.00   0 
 82.50   0 
 85.00   0 
 87.50   0 
 90.00   0 
 92.50   0 
 95.00   0 
 97.50 783 ********************************************************

The after case (run2, use_robinson_probability with spam_cutoff 0.50):

Ham distribution for all runs:
* = 14 items
  0.00 787 *********************************************************
  2.50   0 
  5.00   0 
  7.50   0 
 10.00   0 
 12.50   0 
 15.00   1 *
 17.50   2 *
 20.00   0 
 22.50   2 *
 25.00   0 
 27.50   1 *
 30.00   0 
 32.50   0 
 35.00   0 
 37.50   1 *
 40.00   0 
 42.50   0 
 45.00   0 
 47.50   1 *
 50.00   0 
 52.50   0 
 55.00   0 
 57.50   0 
 60.00   0 
 62.50   0 
 65.00   0 
 67.50   0 
 70.00   0 
 72.50   0 
 75.00   0 
 77.50   0 
 80.00   0 
 82.50   0 
 85.00   0 
 87.50   0 
 90.00   0 
 92.50   0 
 95.00   0 
 97.50   0 

Spam distribution for all runs:
* = 12 items
  0.00   1 *
  2.50   0 
  5.00   0 
  7.50   0 
 10.00   0 
 12.50   0 
 15.00   0 
 17.50   0 
 20.00   0 
 22.50   0 
 25.00   2 *
 27.50   1 *
 30.00   1 *
 32.50   0 
 35.00   0 
 37.50   3 *
 40.00   0 
 42.50   2 *
 45.00   2 *
 47.50   0 
 50.00   2 *
 52.50   2 *
 55.00   6 *
 57.50   7 *
 60.00   6 *
 62.50   6 *
 65.00   5 *
 67.50   6 *
 70.00  11 *
 72.50   4 *
 75.00   1 *
 77.50   5 *
 80.00   5 *
 82.50   6 *
 85.00   5 *
 87.50   3 *
 90.00   1 *
 92.50   0 
 95.00   2 *
 97.50 700 ***********************************************************
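
The after-case scores are far more spread out, yet the error counts are unchanged.  A small sketch shows why, using only the non-empty buckets from the two after-case histograms.  It assumes a message counts as spam when its bucket's lower bound is at or above the 0.50 cutoff (50.00 on the scale used here); the exact comparison the test driver makes is not shown in this message.

```python
# Non-empty buckets from the after-case histograms:
# bucket lower bound (in percent) -> number of messages.
ham_after = {0.0: 787, 15.0: 1, 17.5: 2, 22.5: 2, 27.5: 1, 37.5: 1, 47.5: 1}
spam_after = {
    0.0: 1, 25.0: 2, 27.5: 1, 30.0: 1, 37.5: 3, 42.5: 2, 45.0: 2,
    50.0: 2, 52.5: 2, 55.0: 6, 57.5: 7, 60.0: 6, 62.5: 6, 65.0: 5,
    67.5: 6, 70.0: 11, 72.5: 4, 75.0: 1, 77.5: 5, 80.0: 5, 82.5: 6,
    85.0: 5, 87.5: 3, 90.0: 1, 95.0: 2, 97.5: 700,
}
CUTOFF = 50.0  # spam_cutoff: 0.50 expressed on the histogram's scale

# Hams at or above the cutoff are false positives;
# spams below the cutoff are false negatives.
false_positives = sum(n for low, n in ham_after.items() if low >= CUTOFF)
false_negatives = sum(n for low, n in spam_after.items() if low < CUTOFF)

print(sum(ham_after.values()), sum(spam_after.values()))
print(false_positives, false_negatives)
```

Both histograms add up to 795 messages, no ham reaches the cutoff, and 12 spams fall below it, which matches the "0 to 0" and "12 to 12" totals in the comparison.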

-- Sjoerd Mullender <sjoerd@acm.org>