[Spambayes] Feature idea.
johng at atser.com
Thu Feb 19 13:47:37 EST 2004
I really love SpamBayes and have sold my whole company on it. It works great even for those ditzy blondes in reception who tend to
try to unsubscribe from every spam they get in their box.
I have a suggestion for SpamBayes regarding the threshold feature, i.e. the ability to raise or lower the score cutoffs used for filtering
(for the Spam, Unsure, and Inbox folders, etc.).
Over time, I would suspect the message scores would statistically form a "camel" two-hump curve, i.e. two sets of distributions (I know there is a more technical term for that
in statistics, but it slips my mind atm). As time goes on, the humps would grow and the minimum between them would shift left a little as more and more clever spams are pushed to the right side of the distribution.
I would suspect the best place to set your thresholds would be between the ham and spam distribution humps, or to have your unsure
zone be so many points away from that minimum. It would be nice then to have a checkbox to enable automatic adjustment of the
filtering criteria. (For example, over time my spam threshold has gone down from scores of 75% and above to 15% and above, since I have a large spam hump
after 15% and a smaller ham hump before the 15% mark. IOW, the filter is getting very good and the threshold goes lower as it goes, but I'm having to manually
do statistics and adjust the filter to get very good accuracy out of SpamBayes.)
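
To make the idea concrete, here is a rough sketch in Python of what such an automatic adjustment might look like. It is only an illustration, not SpamBayes code: the function names, the 20-bin histogram, the 0.05 margin, and the ham_cutoff/spam_cutoff variable names are all assumptions made for this example, and scores are assumed to be fractions between 0.0 and 1.0 rather than percentages. It bins the scores, looks for the deepest dip between two humps, and places the unsure zone a fixed margin on either side of that dip.

```python
def histogram(scores, bins=20):
    """Count scores (assumed to lie in 0.0..1.0) into equal-width bins."""
    counts = [0] * bins
    for s in scores:
        idx = min(int(s * bins), bins - 1)  # a score of exactly 1.0 goes in the last bin
        counts[idx] += 1
    return counts


def find_valley(counts):
    """Return the index of the deepest dip between two humps.

    The depth of an interior bin is how far it sits below the lower of
    the tallest bin to its left and the tallest bin to its right.
    If several bins tie, the middle one of the tied bins is returned.
    """
    if len(counts) < 3:
        raise ValueError("need at least 3 bins to find a valley")
    best_depth, best_bins = None, []
    for i in range(1, len(counts) - 1):
        depth = min(max(counts[:i]), max(counts[i + 1:])) - counts[i]
        if best_depth is None or depth > best_depth:
            best_depth, best_bins = depth, [i]
        elif depth == best_depth:
            best_bins.append(i)
    return best_bins[len(best_bins) // 2]


def suggest_thresholds(scores, bins=20, margin=0.05):
    """Suggest (ham_cutoff, spam_cutoff) centred on the valley.

    Scores between the two cutoffs would be treated as unsure.
    The margin is a hypothetical tuning knob for this sketch.
    """
    counts = histogram(scores, bins)
    valley = (find_valley(counts) + 0.5) / bins  # centre of the valley bin
    ham_cutoff = max(0.0, valley - margin)
    spam_cutoff = min(1.0, valley + margin)
    return ham_cutoff, spam_cutoff


# Made-up scores for illustration: a small ham hump near 0, a spam hump near 1.
example_scores = [0.01, 0.02, 0.03, 0.04, 0.02,
                  0.91, 0.92, 0.93, 0.97, 0.98, 0.99]
print(suggest_thresholds(example_scores))
```

A real version would probably want to smooth the histogram first and refuse to adjust anything when there is no clear second hump, since a single-hump distribution has no meaningful valley.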
Just an idea.
BTW, I am a developer with some statistical/math background. I might consider contributing after I familiarize myself more with the group
in operation here, but there is a high probability that I may just leave this as a good comment until my frustration over not seeing this feature implemented
goes over my limits.