Tony Meyer tameyer at ihug.co.nz
Fri Nov 26 01:20:52 CET 2004

> Have recently upgrades to v1.0 spambayes. Using Calypso mail 
> client with spambayes proxy server. I now get the following 
> header added to every email:
> X-Spambayes-Exception: Traceback (most recent call last): . 
> File "sb_server.pyc", line 475, in onRetr . File 
> "spambayes\classifier.pyc", line 190, in chi2_spamprob . File 
> "spambayes\classifier.pyc", line 493, in _getclues . File 
> "spambayes\classifier.pyc", line 508, in _worddistanceget . 
> File "spambayes\classifier.pyc", line 308, in probability 
> .AssertionError

It's very rare to see this error these days.  What it means is that your
database has an entry that has been seen in more ham than the total number
of ham messages you have trained (obviously impossible).  The only cause I
can think of for this is that training was interrupted at just the right (or
wrong, depending on your point of view) moment.

There are two options to fix this:

  1.  The best option, by far, is to retrain from scratch.  Delete the two
database files and start training again - training is very quick, so it'll
hardly take any time to get back to high accuracy.  The FAQ explains how to
go about doing this.

  2.  You can manually repair this error in the database.  You'd have to
install Python (if you haven't already) and use the source version.  You use
the sb_dbexpimp.py script to export the database to a CSV file, which you
can then open (e.g. in Excel) and change the two totals at the top to be at
least as large as the largest values in the ham and spam columns
respectively.  You then use the sb_dbexpimp.py script to import the fixed
CSV file back into the original database format.  We don't recommend this,
as the problem might indicate larger issues with the database, so it's much
more reliable to just retrain.
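
If you'd rather not edit the exported file by hand for option 2, the fix
can be sketched in a few lines of Python.  This is only a rough sketch and
it rests on an assumption you should check against your own export first:
that the first row of the CSV holds the two totals (ham, then spam) and
that every later row is a word followed by its ham count and spam count.
The names repair_counts and repair_csv are made up for this example and
are not part of SpamBayes.

```python
import csv


def repair_counts(rows):
    """Raise the header totals so they cover every per-word count.

    ASSUMPTION: rows[0] is [total_ham, total_spam] and each later row is
    [word, ham_count, spam_count].  Check your exported file before
    relying on this layout.
    """
    if not rows:
        return []
    nham, nspam = int(rows[0][0]), int(rows[0][1])
    for row in rows[1:]:
        if len(row) < 3:
            continue  # skip blank or malformed rows rather than guess
        nham = max(nham, int(row[1]))
        nspam = max(nspam, int(row[2]))
    # Only the header row changes; word rows are copied through untouched.
    return [[str(nham), str(nspam)]] + [list(r) for r in rows[1:]]


def repair_csv(in_path, out_path):
    """Read an exported CSV, fix the totals, and write a new CSV."""
    with open(in_path, newline="") as f:
        rows = list(csv.reader(f))
    with open(out_path, "w", newline="") as f:
        csv.writer(f).writerows(repair_counts(rows))
```

You would run repair_csv on the exported file, writing to a new file so
the original export is kept, and then import the new file with
sb_dbexpimp.py.  This only makes the totals consistent with the counts; it
can't tell you whether anything else in the database was damaged, which is
why retraining is still the safer choice.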

Sorry the news isn't better!


