[Spambayes] Question on "Save and shutdown" button

Tim Peters tim.peters at gmail.com
Wed Jul 21 06:55:25 CEST 2004


[samnicholls at appintec.com]
...
> COMMENT #2 -- The discipline of having to train has made me realize that, in
> my case at least, there is a third category of email, that which is not ham, yet it
> does not fall into the pure definition of spam either.

SpamBayes has no definitions of ham or spam, of course -- they're
whatever you tell it is ham and spam.  You can use any definitions you
like.  For example, I used to get a particular kind of "joke of the
day" spam, hawking everything from "male enhancement" products to
human growth hormone, but I liked the jokes.  I trained on those as
ham.  SB didn't report me to the authorities <wink> -- and it didn't
start believing that all ads for HGH were ham either.

> Examples of this are newsletters from organizations with which I have a loose
> association, and advertisements from companies who have apparently
> purchased a legitimate (?) email list of those who are in my rather narrow vertical
> industry.  Yet for accurate training it is obviously so important to be consistent
> when categorizing the "unsures", and what might be undesirable in the "ham".
>  When an email like this does make it through to my Inbox, I'm in a quandary as
> to how to categorize it.

You don't *have* to categorize it.  SB should be a helpful servant,
not a dictator.  I get a significant amount of email in this grey area
too, and I'm happy to delete it without training on it, leaving it as
Unsure.  IOW, if I have to think twice about whether a given msg "is
ham" or "is spam", then it is in fact Unsure even to me -- so be it.

I've tried consistently training on such things as "ham" or "spam",
but it doesn't do any good:  because I change *my* mind about it from
day to day (highly correlated with how busy I am on different days!),
I end up training the same kinds of things into both categories, and
then they end up scoring Unsure anyway.
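
To see why mixed training washes out, here is a toy sketch in Python.
It is not SpamBayes's actual code:  the function names are made up,
the smoothing constants (s=0.45, x=0.5) and the cutoffs (0.20, 0.90)
are assumed values, and a single token's probability stands in for the
whole message score, where the real classifier combines many tokens.

```python
def token_spamprob(spam_count, ham_count, nspam, nham, s=0.45, x=0.5):
    """Smoothed estimate that a message containing this token is spam.

    spam_count/ham_count: trained spam/ham messages containing the token.
    nspam/nham: total trained spam/ham messages (assumed > 0).
    s, x: assumed prior strength and prior probability for rare tokens.
    """
    n = spam_count + ham_count
    if n == 0:
        return x  # never seen in training: fall back to the prior
    spam_ratio = spam_count / nspam
    ham_ratio = ham_count / nham
    p = spam_ratio / (spam_ratio + ham_ratio)
    # Pull the raw estimate toward the prior when evidence is thin.
    return (s * x + n * p) / (s + n)


def classify(score, ham_cutoff=0.20, spam_cutoff=0.90):
    """Map a score in [0, 1] to a label using the assumed cutoffs."""
    if score < ham_cutoff:
        return "ham"
    if score > spam_cutoff:
        return "spam"
    return "unsure"


# A newsletter token trained as spam on busy days and as ham on quiet
# ones:  5 of 10 spam messages and 5 of 10 ham messages contain it.
mixed = token_spamprob(5, 5, 10, 10)

# The same token if it had been trained only as spam.
spam_only = token_spamprob(10, 0, 10, 10)

print(classify(mixed), classify(spam_only))
```

A token trained equally into both categories lands in the middle of the
range, so it gives the classifier no evidence either way.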

Ambiguity is a fact of life in many areas.  I think email is one of
them.  With Unsures left untrained, SB's idea of "unsure" has gotten
quite close to my own.

