[Spambayes] Proposing to remove 4 combining schemes

Rob W. W. Hooft rob@hooft.net
Thu Oct 17 15:12:06 2002


Sean True wrote:
> I hate to try to speak for Joe User (like speaking for the "common man",
> always a red flag), but I _am_ just a user of these scoring schemes. I have
> several hundred messages (commercial email) tucked away in a folder that
> score in the non-chi scheme in the range .4 to .6. That score appears to
> reflect my own real uncertainty about the value of Motley Fool newsletters.
> No snickering, please. A system like chi- looks like a very good choice for
> black-and-white, upstream discarding of offers to increase body part size.
> 
> But I don't want these messages automatically discarded upstream, I want
> them labelled so that I can deal with them more efficiently.
> 
> When I sort this particular folder by spam score, I get MIT club and
> Infoworld newsletters at the beginning (the good end), and the Motley
> Fool and Edgar Online at the other end, with a range of spam score from .2
> to .6. Just right. If I could color them continuously, it would be easy to
> spot the ones I want to read, now. And over time, as I change my definition
> of spam, their position in the list looks like it will vary smoothly -- and
> appropriately.
> 
> This may not fit your original mission statement, but mission statements
> often don't survive contact with the enemy, err, customer.

But I agree 100%! Sorting on the spamminess/hamminess is very useful. 
Coloring on the spamminess/hamminess is very useful. But only in the 
middle-ground folder. And the numeric values as such are useless, in my 
humble opinion. Part of my work is to make "clean" user interfaces, and 
I am allergic to showing things that the user can't do anything with.

I understood Tim's original idea to be that he wanted to see the 
spamminess of clear-cut spam and the hamminess of clear-cut ham. I don't 
see the point of that, but there would be an easy way to do it: Remap 
the probabilities such that 0->0; hamcutoff->0.33; spamcutoff->0.66; 
1->1 using any monotonically increasing function (e.g. three linear 
segments).
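
A minimal sketch of the three-linear-segment remapping, in Python. The 
function name remap_score and the default cutoff values (0.20 and 0.90) 
are assumptions chosen for illustration; they are not taken from the 
Spambayes code, and any pair with 0 < ham_cutoff < spam_cutoff < 1 
would do.

```python
def remap_score(p, ham_cutoff=0.20, spam_cutoff=0.90):
    """Piecewise-linear remap of a score p in [0, 1].

    Maps 0 -> 0, ham_cutoff -> 0.33, spam_cutoff -> 0.66, 1 -> 1.
    The cutoff defaults are illustrative assumptions, not project values.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if not 0.0 < ham_cutoff < spam_cutoff < 1.0:
        raise ValueError("need 0 < ham_cutoff < spam_cutoff < 1")

    # Knots of the three linear segments as (input, output) pairs.
    knots = [(0.0, 0.0), (ham_cutoff, 0.33), (spam_cutoff, 0.66), (1.0, 1.0)]

    # Find the segment containing p and interpolate linearly within it.
    for (x0, y0), (x1, y1) in zip(knots, knots[1:]):
        if p <= x1:
            return y0 + (y1 - y0) * (p - x0) / (x1 - x0)
    return 1.0  # not reached for p in [0, 1]; kept as a safe fallback
```

With these assumed cutoffs, every unsure message lands between 0.33 and 
0.66 after the remap, and the ordering of scores is unchanged because 
each segment has a positive slope.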

Rob

-- 
Rob W.W. Hooft  ||  rob@hooft.net  ||  http://www.hooft.net/people/rob/