[spambayes-dev] Incorrect Outlook stats

Kenny Pitt kennypitt at hotmail.com
Tue Nov 23 14:59:30 CET 2004


Tony Meyer wrote:
>> So, in order to test the
>> statistics for incorrect classifications, I trained one of my good
>> messages as spam and checked to see that it showed up as a false
>> negative. 
>> 
>> I then trained the message back to good, but the statistics still
>> showed that I had one false negative.  It seems like the correct
>> behavior would be to erase the false negative if the message is
>> trained back to its original classification.
> 
> An odd case, to be sure :)

Yes, I'll admit that, but that's the life of a developer, eh? <wink>  I
guess it's similar to what would happen if someone clicked the wrong
training button on a message and then had to recover it, though.

> AFAICT the only way** to fix this would be to add more information to
> the message_db (a la the non-Outlook version).  I believe we can do
> this in a backwards-compatible way, although there will be a
> reasonable number of changes, I suspect.  Should I go ahead and do
> this?    
> 
> * Of course, we store only the score, so if the thresholds have
> changed, all bets are off. 
> 
> ** Well, other than adding another field to the message, or something
> like that. 

I was afraid that would be the case.  I wouldn't be opposed to adding an
"original score" field to the message if that's the easiest way, but I
suspect that putting it into the message db would be a better approach.
Which stats we update depends on both the original classification and the
training status of the message, so it seems best to reset that information
whenever the training data is reset.  That would be much easier to do in
the message db than in fields on the individual messages.
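
As a rough sketch of the idea only (not the actual message_db code; the
class, method, and field names below are all made up for illustration),
keeping the original classification next to the training status lets the
error counts be recomputed, so retraining a message back to its original
class removes the error, and resetting the db resets the stats with it:

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Optional


class Label(Enum):
    HAM = "ham"
    UNSURE = "unsure"
    SPAM = "spam"


@dataclass
class MessageRecord:
    # Hypothetical per-message entry; not the real message_db layout.
    original: Label                     # classification when first filtered
    trained_as: Optional[Label] = None  # None until the user trains it


@dataclass
class MessageDB:
    # Hypothetical stand-in for the message db, keyed by message id.
    records: Dict[str, MessageRecord] = field(default_factory=dict)

    def classified(self, msg_id: str, label: Label) -> None:
        """Record how the filter classified a newly arrived message."""
        self.records[msg_id] = MessageRecord(original=label)

    def trained(self, msg_id: str, label: Label) -> None:
        """Record the latest training decision, replacing any earlier one."""
        self.records[msg_id].trained_as = label

    def stats(self) -> Dict[str, int]:
        """Derive the error counts from the current state of each record."""
        fp = fn = 0
        for rec in self.records.values():
            if rec.original is Label.SPAM and rec.trained_as is Label.HAM:
                fp += 1
            elif rec.original is Label.HAM and rec.trained_as is Label.SPAM:
                fn += 1
        return {"false_positives": fp, "false_negatives": fn}

    def reset(self) -> None:
        """Resetting the training data drops the stats information too."""
        self.records.clear()
```

Because the counts are derived rather than incremented, there is nothing
to "undo" when a message is trained back to its original classification.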

-- 
Kenny Pitt


