[Spambayes] Perhaps a level header would be useful?

Bill Yerazunis wsy at merl.com
Mon Mar 10 12:32:21 EST 2003


   From: Skip Montanaro <skip at pobox.com>

   Classification is being done by me on the server, not by the users on their
   desktops.  I was just chatting with a couple of the admins here who
   commented that SpamAssassin's X-Spam-Level header is nice because you can
   tell users to just add or delete a star from their Eudora filter to
   fine-tune the break between spam and ham. 

   That might be a bit weird with Spambayes since it's a three-state system,
   but I think it might be useful to add an X-Spambayes-Level header where the
   number of stars is equal to int(score*10).  I control the ham and spam
   cutoffs, and thus the inclusion of the words "ham", "unsure" and "spam", but
   this would make it easy for people to filter on a score basis in their mail
   client.  Sort of a fine-tuning knob.

   or-a-fake-thermostat-ly, y'rs,
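
A minimal sketch of the header proposed in the quoted message, assuming the
score is a float between 0.0 and 1.0. The function name level_header and the
max_stars cap are hypothetical, chosen for illustration; this is not
Spambayes code.

```python
def level_header(score: float, max_stars: int = 10) -> str:
    """Build an X-Spambayes-Level header line from a score in [0.0, 1.0].

    The star count is int(score * 10), as proposed in the message above.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0.0 and 1.0")
    # int() truncates, so 0.55 gives 5 stars and 0.99 gives 9.
    stars = min(int(score * 10), max_stars)
    return "X-Spambayes-Level: " + "*" * stars


if __name__ == "__main__":
    for s in (0.0, 0.55, 0.99, 1.0):
        print(level_header(s))
```

A mail-client rule that looks for a run of seven stars in this header would
then catch every message scored 0.7 or higher, and adding or removing a star
in the rule moves the cutoff by 0.1.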

I've also had multiple requests for a continuous output match parameter in
CRM114, so I settled on this:

      pR = - (log (Pspam) - log (Pnonspam))

This goes from roughly +350 to -350, and (nicely) the uncertains
and errors all seem to group around +/- 100.

90%+ of the messages come out either > 200 or < -200, so it's an 
effective human-understood representation.
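
As a rough sketch, the pR formula above can be computed like this. The
base-10 logarithm (by analogy with pH), the function name pr, and the floor
value that guards against log(0) are all assumptions made for illustration;
this is not the CRM114 implementation.

```python
import math


def pr(p_spam: float, p_nonspam: float, floor: float = 1e-300) -> float:
    """Return pR = -(log(Pspam) - log(Pnonspam)).

    Assumes base-10 logs. Probabilities are clamped to `floor`
    (a hypothetical guard) so that log10(0) is never evaluated.
    """
    p_spam = max(p_spam, floor)
    p_nonspam = max(p_nonspam, floor)
    return -(math.log10(p_spam) - math.log10(p_nonspam))


if __name__ == "__main__":
    # Equal probabilities give 0; the sign flips with the likelier class.
    print(pr(0.5, 0.5))
    print(pr(1e-250, 1.0))
    print(pr(1.0, 1e-250))
```

With the sign exactly as written in the formula, a message whose spam
probability dominates comes out negative and one whose nonspam probability
dominates comes out positive.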

I know the CAMRAM people wanted it pretty badly; expect them to 
start using it soon.

(It's called pR for the same reason pH is called pH - it's the
negative log of the ratio of the match probabilities, just like
pH is the negative log of the hydrogen ion concentration.)

  -Bill Yerazunis



