[spambayes-bugs] [ spambayes-Bugs-797890 ] Assertion errors from classifier for new messages

SourceForge.net noreply at sourceforge.net
Mon Sep 1 10:21:14 EDT 2003


Bugs item #797890, was opened at 2003-08-30 12:37
Message generated for change (Comment added) made by avitous
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=498103&aid=797890&group_id=61702

Category: pop3proxy
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Anderson J. Vitous (avitous)
Assigned to: Nobody/Anonymous (nobody)
Summary: Assertion errors from classifier for new messages

Initial Comment:
Installed the latest CVS version as of 8/30/03 (1.0a4 
wouldn't work for me, so I reinstalled).  Python version is 
2.2.3 on Windows XP with the latest pybsddb installed.  Trained 
with recent spam/ham collections (about equal numbers 
of messages, approx. 400 each), then configured the proxy.  Using 
the Mahogany mail client pointed at pop3proxy (localhost).

Each new message retrieved results in an assertion 
failure:

AssertionError
Traceback (most recent call last):
  File "D:\apps\spambayes\pop3proxy.py", line 439, in onRetr
    evidence=True)
  File "D:\apps\spambayes\spambayes\classifier.py", line 223, in chi2_spamprob
    clues = self._getclues(wordstream)
  File "D:\apps\spambayes\spambayes\classifier.py", line 451, in _getclues
    prob = self.probability(record)
  File "D:\apps\spambayes\spambayes\classifier.py", line 307, in probability
    assert hamcount <= nham

(from console running pop3proxy.py)

Also inserted into headers of each message:
X-Spambayes-Exception: exceptions.AssertionError() in probability() at
	D:\apps\spambayes\spambayes\classifier.py line 307: assert
	hamcount <= nham
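
For readers unfamiliar with the check that is failing: the assertion says that no word can have been seen in more ham messages than the total number of ham messages trained.  The sketch below is a toy illustration of that invariant only.  The class TinyCounts and its method names are hypothetical and are not SpamBayes code, and resetting the total by hand is just one artificial way to make the counts disagree; it is not a diagnosis of this bug.

```python
class TinyCounts:
    """Toy ham bookkeeping (hypothetical; not the SpamBayes classifier)."""

    def __init__(self):
        self.nham = 0        # total number of ham messages trained
        self.hamcounts = {}  # word -> number of ham messages containing it

    def learn_ham(self, words):
        # Count each distinct word once per message.
        self.nham += 1
        for word in set(words):
            self.hamcounts[word] = self.hamcounts.get(word, 0) + 1

    def ham_ratio(self, word):
        hamcount = self.hamcounts.get(word, 0)
        # Same shape of check as the one in the traceback above.
        assert hamcount <= self.nham
        return hamcount / self.nham if self.nham else 0.0


counts = TinyCounts()
counts.learn_ham(["hello", "world"])
counts.learn_ham(["hello"])
consistent_ratio = counts.ham_ratio("hello")  # counts agree, no error

# Artificially make the total disagree with the per-word counts.
counts.nham = 0
try:
    counts.ham_ratio("hello")
    assertion_fired = False
except AssertionError:
    assertion_fired = True
```

With consistent counts the check passes silently; once the message total is smaller than a per-word count, every lookup of that word trips the assertion, which matches the "every message fails" symptom in general terms.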


----------------------------------------------------------------------

>Comment By: Anderson J. Vitous (avitous)
Date: 2003-09-01 09:21

Message:
Logged In: YES 
user_id=353986

My original install attempt was with 1.0a4 with the default db 
(dumbdbm), and I had corruption problems.  I then downloaded a 
CVS snapshot and noted I needed pybsddb (running Python 2.2.3), 
so I installed it before installing SpamBayes-cvs and 
discovering the issue I reported here.  When I reverted to 
1.0a4 I used a fresh install, and it worked since pybsddb was 
now present.

----------------------------------------------------------------------

Comment By: Richie Hindle (richiehindle)
Date: 2003-08-30 16:23

Message:
Logged In: YES 
user_id=85414

Could you please clarify something?  You say "adding pybsddb
helped", but your initial problem description says "...with
latest pybsddb installed."  When you were seeing the problems,
did you have pybsddb installed or not?


----------------------------------------------------------------------

Comment By: Anderson J. Vitous (avitous)
Date: 2003-08-30 16:19

Message:
Logged In: YES 
user_id=353986

Just got 1.0a4 working (adding pybsddb helped), and although 
it has a particular training issue through the SMTP proxy, it works 
against the corpuses which caused the error documented here 
in the CVS snapshot version.


----------------------------------------------------------------------

Comment By: Anderson J. Vitous (avitous)
Date: 2003-08-30 14:55

Message:
Logged In: YES 
user_id=353986

I'd love to help track this down, but I cannot send you my 
corpuses (private data) and don't have time right now to 
'sanitize' it.  Perhaps later this weekend I can find the time, 
but I don't want to give away anybody's emails in the process.

I'm training through the web interface, with culled most-recent 
message data for spam/ham corpuses.  No errors show up 
during this exercise; messages are imported as a single mbox 
file for each corpus.  Subsequently I can query on various 
words and the responses seem to be reasonable.

The error shows up when I subsequently shut down, restart 
pop3proxy.py (I was having a problem with the SMTP proxy not 
working with 1.0a4, so I got used to doing that...), connect with 
my mail client, and retrieve new messages; every message 
results in the assertion error.

Please let me know what else I can do besides sending private 
data:  what kind of debug logs can be written while exercising 
this?
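
One kind of debug output that would not expose any message text is a list of the words whose stored counts exceed the trained totals.  The following is only a sketch under stated assumptions: it takes a plain dict mapping each word to a (spamcount, hamcount) pair plus the two totals, and the function name find_inconsistent_words and the sample numbers are made up for illustration.  How to pull those values out of a real SpamBayes database is not shown here.

```python
def find_inconsistent_words(word_counts, nham, nspam):
    """Return (word, spamcount, hamcount) for counts exceeding the totals.

    word_counts is assumed to map word -> (spamcount, hamcount).
    """
    bad = []
    for word, (spamcount, hamcount) in sorted(word_counts.items()):
        if hamcount > nham or spamcount > nspam:
            bad.append((word, spamcount, hamcount))
    return bad


# Made-up sample data, not taken from any real training database.
sample_counts = {
    "viagra": (40, 0),
    "meeting": (1, 35),   # hamcount 35 exceeds nham=30 below
    "free": (12, 3),
}
report = find_inconsistent_words(sample_counts, nham=30, nspam=40)
for word, spamcount, hamcount in report:
    print("inconsistent:", word, "spamcount =", spamcount, "hamcount =", hamcount)
```

A report like this contains only tokens and counts, so it could be attached to a bug without sharing the corpuses themselves, although individual tokens may still be sensitive and are worth reviewing first.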

----------------------------------------------------------------------

Comment By: Richie Hindle (richiehindle)
Date: 2003-08-30 13:01

Message:
Logged In: YES 
user_id=85414

This is a bug that's been cropping up from time to time,
but we haven't been able to reproduce it.  It sounds from
your description as though you have a way to reproduce it - just
train on your corpuses and it fails straight away...?  If that's
true, could you attach your corpuses to this bug report?
Or if you've private messages in there, would you be willing
to send the corpuses to me directly?  I'd love to be able to
track this one down.

If you do send/attach your corpuses, please zip them up
to guarantee they don't get mangled by intermediate
mail/web servers.

How are you training?  Through the web, or on the command
line?  If on the command line, what is the exact command
you're using?


----------------------------------------------------------------------



