[Spambayes] (non-)spam count would go negative!

Meyer, Tony T.A.Meyer at massey.ac.nz
Thu May 8 12:31:49 EDT 2003


> So I moved these [incorrectly classified]
> message to their rightful folders, deleted the 
> database, and retrained.  But this time, mboxtrain.py dies 
> with a message
> 
>   (non-)spam count would go negative!
> 
> when it gets across one of the reclassified messages.

This happens because mboxtrain recognises that you have moved the
messages and tries to untrain each one before it does the new train.
Since you deleted the database, there is nothing left to untrain, so
the count would drop below zero.  There are two solutions to this:
  * Don't delete the database.  If you just move the messages and
retrain, mboxtrain will find the old header, realise that the
classification has changed, and do the appropriate unlearn/learn.
  * Delete the database and use the -f switch.  If you look in the doc
(-h), you'll see that this is for rebuilding from scratch.  Basically
this ignores the headers so that no untraining is attempted.
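
To make the two cases concrete, here is a rough sketch of the logic
described above.  It is not the actual mboxtrain code: ToyClassifier,
train_message, the force argument (standing in for -f) and the header
name X-Trained-As are all made up for illustration, and the classifier
only keeps message counts.

```python
from email.message import Message

# Hypothetical header name, used only for this sketch.
TRAINED_HEADER = "X-Trained-As"


class ToyClassifier:
    """Stand-in for the real classifier: it only tracks message counts."""

    def __init__(self):
        self.nspam = 0
        self.nham = 0

    def learn(self, is_spam):
        if is_spam:
            self.nspam += 1
        else:
            self.nham += 1

    def unlearn(self, is_spam):
        # Refuse to untrain a message the database has never seen.
        if is_spam:
            if self.nspam <= 0:
                raise ValueError("spam count would go negative!")
            self.nspam -= 1
        else:
            if self.nham <= 0:
                raise ValueError("non-spam count would go negative!")
            self.nham -= 1


def train_message(clf, msg, is_spam, force=False):
    """Train on msg; return True if the classifier was updated."""
    new_label = "spam" if is_spam else "ham"
    old_label = msg[TRAINED_HEADER]  # None if the message is unmarked
    if not force and old_label is not None:
        if old_label == new_label:
            return False  # already trained this way, nothing to do
        # The message moved folders: undo the old training first.
        clf.unlearn(old_label == "spam")
    clf.learn(is_spam)
    del msg[TRAINED_HEADER]  # no error if the header is absent
    msg[TRAINED_HEADER] = new_label
    return True
```

With a fresh ToyClassifier and a message still marked "ham" from an
earlier run, train_message(clf, msg, True) raises the ValueError,
because the empty database has no ham to untrain.  Passing force=True
skips the header check and simply learns the message as spam.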

> Is the notion of marking messages used 
> for training with a header line really well thought out?  
> From my limited experience with this "feature", I would suggest not.

It's really not a problem with marking them with a header line.  (On the
other hand, it could be a problem that the documentation isn't clear
enough about what should be done in this sort of situation.)

However, there is a project at the moment to hold this sort of training
information in a database of its own (for other reasons).  It's being
tested with pop3proxy and imapfilter at first, but the aim is that it
will eventually be used by mboxtrain/hammie/Outlook/everything else.

=Tony Meyer


