[spambayes-bugs] [ spambayes-Bugs-1101281 ] imapfilter with mysql on mac has assertion error

SourceForge.net noreply at sourceforge.net
Thu Jan 13 00:13:10 CET 2005


Bugs item #1101281, was opened at 2005-01-13 11:54
Message generated for change (Comment added) made by anadelonbrin
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=498103&aid=1101281&group_id=61702

Category: imapfilter
Group: 1.0.1
Status: Open
Resolution: None
Priority: 5
Submitted By: jscott (jscottjfs)
Assigned to: Tony Meyer (anadelonbrin)
Summary: imapfilter with mysql on mac has assertion error

Initial Comment:
Using persistent_use_database=False trains OK

Keeping everything else the same, and switching to
mysql leads to the following errors (or similar
assertion errors involving nspam instead) after a
couple minutes of training.

the mysql database has this:

mysql> describe bayes;
+-------+--------------+------+-----+---------+-------+
| Field | Type         | Null | Key | Default | Extra |
+-------+--------------+------+-----+---------+-------+
| word  | varchar(255) |      | PRI |         |       |
| nspam | int(11)      |      |     | 0       |       |
| nham  | int(11)      |      |     | 0       |       |
+-------+--------------+------+-----+---------+-------+

and 


mysql> select count(word) from bayes;
+-------------+
| count(word) |
+-------------+
|       20125 |
+-------------+


So everything is working well.  Then, somehow, the training
runs amok, crashing imapfilter and giving this:



[dhcp-235-023:~/spambayes-1.0.1/scripts] jscott% python sb_imapfilter.py -c -t -l -5
SpamBayes IMAP Filter Version 0.5 (November 2004)
and engine SpamBayes Engine Version 0.3 (January 2004).

Traceback (most recent call last):
  File "sb_imapfilter.py", line 924, in ?
    run()
  File "sb_imapfilter.py", line 914, in run
    imap_filter.Filter()
  File "sb_imapfilter.py", line 785, in Filter
    self.unsure_folder)
  File "sb_imapfilter.py", line 703, in Filter
    evidence=True)
  File "/System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/site-packages/spambayes/classifier.py", line 190, in chi2_spamprob
    clues = self._getclues(wordstream)
  File "/System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/site-packages/spambayes/classifier.py", line 493, in _getclues
    tup = self._worddistanceget(word)
  File "/System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/site-packages/spambayes/classifier.py", line 508, in _worddistanceget
    prob = self.probability(record)
  File "/System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/site-packages/spambayes/classifier.py", line 308, in probability
    assert hamcount <= nham
AssertionError


----------------------------------------------------------------------

>Comment By: Tony Meyer (anadelonbrin)
Date: 2005-01-13 12:13

Message:
Logged In: YES 
user_id=552329

The problem is that nham (nspam) is meant to be the total
number of ham (spam) messages that you have trained.  It
looks like it's 0 above, which is not good.

Offhand, I'm not sure what would cause this - updating the
nham/nspam values is done at the same time as the token
counts, so if one is wrong, they really ought to both be.
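
To make the invariant behind the failing assertion concrete, here is a
minimal sketch of the kind of check one could run over the stored
counts.  The names hamcount and nham come from the traceback; the
helper find_inconsistent_tokens, the records mapping and the sample
data are hypothetical, for illustration only, and are not part of
SpamBayes.

```python
# Hypothetical helper, not part of SpamBayes: find tokens whose per-word
# counts exceed the trained-message totals.
def find_inconsistent_tokens(records, nspam, nham):
    """Return the words whose counts violate the invariant.

    records maps word -> (spamcount, hamcount); nspam and nham are the
    total numbers of spam and ham messages trained.
    """
    bad = []
    for word, (spamcount, hamcount) in records.items():
        # A word cannot have been seen in more ham (spam) messages
        # than were trained as ham (spam) in total.
        if hamcount > nham or spamcount > nspam:
            bad.append(word)
    return sorted(bad)


# Made-up data: with totals of 0, every trained word is inconsistent.
records = {"viagra": (3, 0), "meeting": (0, 2)}
print(find_inconsistent_tokens(records, nspam=0, nham=0))
print(find_inconsistent_tokens(records, nspam=3, nham=2))
```

If the totals really were 0 while the word counts were not, a check
like this would flag every trained word, which is consistent with the
assertion firing soon after classification starts.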

I'll try to find time to replicate this here later today
and update with what happens.

----------------------------------------------------------------------


