[spambayes-bugs] [ spambayes-Bugs-821808 ] sb_mboxtrain.py fails to mark X-Spambayes-Trained

SourceForge.net noreply at sourceforge.net
Sat Oct 11 11:46:42 EDT 2003


Bugs item #821808, was opened at 2003-10-11 08:46
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=498103&aid=821808&group_id=61702

Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Alan W. Irwin (airwin)
Assigned to: Nobody/Anonymous (nobody)
Summary: sb_mboxtrain.py fails to mark X-Spambayes-Trained

Initial Comment:
Symptoms:   
 
I have chosen a two-message mbox folder called libtool as an 
example, but I get the same result with larger folders as well.   
 
irwin at starling> sb_mboxtrain.py -d ~/.spambayes/hammie.dbm -g ~/cdburn0/Mail/libtool
Training ham (/home/irwin/cdburn0/Mail/libtool):
  Reading as Unix mbox
  Trained 2 out of 2 messages
irwin at starling> sb_mboxtrain.py -d ~/.spambayes/hammie.dbm -g ~/cdburn0/Mail/libtool
Training ham (/home/irwin/cdburn0/Mail/libtool):
  Reading as Unix mbox
  Trained 2 out of 2 messages
 
Note that the second run still trains 2 messages rather than 0.
Also, the folder is left completely unchanged by these training
runs. It still has its September modification date:

ls -l ~/cdburn0/Mail/libtool
-rw-------    1 irwin    irwin        4290 Sep 11 15:20 /home/irwin/cdburn0/Mail/libtool

and no X-Spambayes-Trained header line has been added to the messages.
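
For anyone trying to reproduce this, the absence of the header can be
checked independently of sb_mboxtrain.py. The following is only a minimal
sketch, not part of spambayes: it assumes a current Python 3 with the
standard mailbox module (not the Python 2.3 setup described in this
report), and the function name count_trained is made up for illustration.

```python
import mailbox

# Header name taken from this report.
TRAINED_HEADER = "X-Spambayes-Trained"


def count_trained(mbox_path):
    """Return (marked, total) for the messages in a Unix mbox file.

    'marked' counts messages that carry an X-Spambayes-Trained header.
    The mbox is only read, never modified.
    """
    # create=False: raise an error instead of silently creating an
    # empty mailbox if the path is wrong.
    box = mailbox.mbox(mbox_path, create=False)
    try:
        marked = 0
        total = 0
        for msg in box:
            total += 1
            # Message lookup returns None when the header is absent.
            if msg[TRAINED_HEADER] is not None:
                marked += 1
        return marked, total
    finally:
        box.close()
```

Calling count_trained("/home/irwin/cdburn0/Mail/libtool") after a
training run should report as many marked messages as total messages
if the header were being written. With the behaviour described here
the marked count would stay at 0.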
 
Configuration file: 
 
cat ~/.spambayesrc
[Headers]
include_trained: True
[Storage]
persistent_storage_file: ~/.spambayes/hammie.dbm
persistent_use_database: True
 
The include_trained: True setting should be redundant (it is the
default), but I set it explicitly anyway to try to force the
X-Spambayes-Trained header to be written. It made no difference.
 
I believe this bug has serious consequences. Because sb_mboxtrain.py
behaves as if the -f option were always on, there is no way to
retrain spambayes incrementally with the recommended cron tasks.
For users unaware of the bug, the database slowly becomes distorted
as the same messages are trained over and over, and previous
classification mistakes cannot be corrected.
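
To make the expected behaviour concrete, here is a rough sketch of the
header-based bookkeeping I would expect. It is NOT the actual
sb_mboxtrain.py code: the function names are hypothetical, the header
values "ham" and "spam" are an assumption, and it is written against
the email package of a current Python 3.

```python
from email.message import Message

# Header name taken from this report; the values "ham"/"spam" are assumed.
TRAINED_HEADER = "X-Spambayes-Trained"


def needs_training(msg, is_spam, force=False):
    """Return True if msg should be (re)trained as ham or spam.

    A message is skipped only when its header already records the
    same classification and force is off.
    """
    if force:
        # Corresponds to always retraining, as with the -f option.
        return True
    wanted = "spam" if is_spam else "ham"
    # Missing header (None) or a different value both mean "train".
    return msg.get(TRAINED_HEADER) != wanted


def mark_trained(msg, is_spam):
    """Record the classification in the message header."""
    # Deleting a header that is absent is a no-op for Message objects.
    del msg[TRAINED_HEADER]
    msg[TRAINED_HEADER] = "spam" if is_spam else "ham"
```

With bookkeeping like this, a second ham run over the same folder
would train 0 messages, while a message moved from a spam folder to a
ham folder would still be picked up and corrected. Both only work if
the marked messages are actually written back to the folder.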
 
One presumable workaround (I haven't tried it yet) is to have the
cron task delete the database and retrain from scratch every time,
but that is wasteful of resources given the huge collection of spam
and ham mail folders I have accumulated.
 
If others here have trouble reproducing this bug, then here are 
some details about my system:   
 
I am running a Debian stable Linux distribution, which I have
modified by downloading and installing the Python 2.3.2 tarball,
Python-2.3.2.tgz, from python.org.
 
python  
Python 2.3.2 (#1, Oct 10 2003, 17:38:20)  
[GCC 2.95.4 20011002 (Debian prerelease)] on linux2  
Type "help", "copyright", "credits" or "license" for more 
information.  
>>>   
 
I have downloaded and installed  spambayes-1.0a6.1.tar.gz 
from sourceforge.net.   
 
If there is any difficulty verifying this bug, I will be happy to supply 
more details about my system, run more tests, etc., since spam 
waits for no man, and it is fairly urgent I get it fixed. 

----------------------------------------------------------------------



