[Spambayes] How are good messages treated?

Fri Jun 27 14:49:15 EDT 2003

>> To phrase it another way, are the only ham words added to the
>> database the ones from messages that I manually 
>> indicate as ham?
[Tony]
> Yes.
[Skip]
> Are you sure?  In SpamAtBay at least, my inbox has no "save 
> as ham" button. Similarly, my spam box has no "save as spam" 
> button.  Only my unsure box has both.  I believe it trains 
> automatically on messages classified as ham or spam, then if 
> you change the classification, it untrains as one then 
> retrains as the other.

Is this maybe a SpamAtBay/Spambayes difference?  (I haven't used
SpamAtBay).  My inbox and spam box have the appropriate buttons - but
then, *every* mail folder apart from the spam box has the "delete as
spam" button.  (The "recover from spam" button is present in the unsure
box and the spam box).

Looking at the code, there is this in addin.py (in the spam box
watcher):
        # we assume that if the calculated spam prob
        # was *not* certain-spam, or it is in the ham corpa,
        # then it should be trained as such.
So only messages that are correctively moved into the spam folder are
trained.  (And given code a bit lower, messages that have not been
scored at all, i.e. don't have the field, are not trained either - this
explains my earlier confusion over moving messages into watched
folders.)
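
As a rough sketch of that spam-folder rule, going only by the comment
quoted above (this is not the actual addin.py code; the function name,
the parameters, the 0.0-1.0 score scale and the 0.9 cutoff are all
made up for illustration):

```python
def should_train_as_spam(score, in_ham_corpus, spam_cutoff=0.9):
    """Hypothetical helper: decide whether a message that turned up in
    the spam folder should be trained as spam.

    score         -- stored spam probability (assumed 0.0 to 1.0), or
                     None if the message has never been scored
    in_ham_corpus -- True if the message was previously trained as ham
    spam_cutoff   -- assumed threshold for "certain spam"
    """
    # Never-scored messages (no score field) are left alone.
    if score is None:
        return False
    # Train when the filter was not already certain it was spam,
    # or when the message had been trained as ham before.
    return score < spam_cutoff or in_ham_corpus
```

So a message the filter already put there as certain spam is skipped,
while one dragged in by hand with a low score gets trained.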

For ham box watchers, if the message doesn't have a score yet (i.e. it
is new) it is not trained.  If it does, and the option to retrain is
set, then it is trained on, *as long as it is not already classified as
ham*.
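
The ham-folder rule can be sketched the same way (again purely
illustrative: the function and parameter names are invented, and the
real code in addin.py may be structured quite differently):

```python
def should_train_as_ham(score, retrain_enabled, classified_as_ham):
    """Hypothetical helper: decide whether a message seen by a ham
    folder watcher should be trained as ham.

    score             -- stored spam probability, or None if the
                         message is new and has no score yet
    retrain_enabled   -- True if the retrain option is switched on
    classified_as_ham -- True if the message is already classified
                         as ham
    """
    # New, unscored messages are not trained.
    if score is None:
        return False
    # Scored messages are only trained when the retrain option is set.
    if not retrain_enabled:
        return False
    # Messages already classified as ham are not trained on again.
    return not classified_as_ham
```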

Confusing, huh?  Works well, though... :)  What SpamAtBay does, I don't
know - given that your buttons show up in different folders, perhaps
it does train on everything.

=Tony Meyer