[Spambayes] Perhaps a level header would be useful?
wsy at merl.com
Mon Mar 10 12:32:21 EST 2003
From: Skip Montanaro <skip at pobox.com>
Classification is being done by me on the server, not by the users on their
desktops. I was just chatting with a couple of the admins here who
commented that SpamAssassin's X-Spam-Level header is nice because you can
tell users to just add or delete a star from their Eudora filter to
fine-tune the break between spam and ham.
That might be a bit weird with Spambayes since it's a three-state system,
but I think it might be useful to add an X-Spambayes-Level header where the
number of stars is equal to int(score*10). I control the ham and spam
cutoffs, and thus the inclusion of the words "ham", "unsure" and "spam", but
this would make it easy for people to filter on a score basis in their mail
client. Sort of a fine-tuning knob.
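A minimal sketch of the header idea above, assuming the score is a float between 0 and 1 and that a "*" is the star character, as in SpamAssassin's X-Spam-Level. The function name level_header is hypothetical and not part of Spambayes:

```python
def level_header(score: float, star: str = "*") -> str:
    """Build an X-Spambayes-Level header line (hypothetical helper).

    The number of stars is int(score * 10), as proposed in the message,
    so a score of 0.95 yields 9 stars and a score of 1.0 yields 10.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    return "X-Spambayes-Level: " + star * int(score * 10)


if __name__ == "__main__":
    for s in (0.0, 0.25, 0.95, 1.0):
        print(level_header(s))
```

A mail client rule could then match on a fixed run of stars, and a user could add or delete one star in the rule to move their own cutoff.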
I've also had multiple requests for a continuous output match parameter in
CRM114, so I settled on this:
pR = - (log (Pspam) - log (Pnonspam))
This goes from roughly +350 to -350, and (nicely) the uncertains
and errors all seem to group around +/- 100.
90%+ of the messages come out either > 200 or < -200, so it's an
effective human-understood representation.
I know the CAMRAM people wanted it pretty badly; expect them to
start using it soon.
(it's called pR for the same reason pH is called pH - it's the
negative log of the ratios of the match probabilities, just like
pH is the negative log of the ion ratios.)
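A rough sketch of the pR calculation, taking the formula literally as written above. The log base is an assumption (base 10, by analogy with pH; the message does not say), and pr_score is a hypothetical name, not CRM114 code:

```python
import math


def pr_score(p_spam: float, p_nonspam: float) -> float:
    """Compute pR = -(log(Pspam) - log(Pnonspam)).

    Assumes base-10 logs; both probabilities must be positive
    because log(0) is undefined.
    """
    if p_spam <= 0.0 or p_nonspam <= 0.0:
        raise ValueError("probabilities must be positive")
    return -(math.log10(p_spam) - math.log10(p_nonspam))


if __name__ == "__main__":
    print(pr_score(0.5, 0.5))     # equal probabilities give 0
    print(pr_score(1e-200, 1.0))  # tiny Pspam gives a large positive pR
    print(pr_score(1.0, 1e-200))  # tiny Pnonspam gives a large negative pR
```

With the formula in this form, a message whose Pspam exceeds its Pnonspam comes out negative, and equal probabilities give zero.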