[spambayes-bugs] [ spambayes-Bugs-821808 ] sb_mboxtrain.py fails to
mark X-Spambayes-Trained
SourceForge.net
noreply at sourceforge.net
Sat Oct 11 11:46:42 EDT 2003
Bugs item #821808, was opened at 2003-10-11 08:46
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=498103&aid=821808&group_id=61702
Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Alan W. Irwin (airwin)
Assigned to: Nobody/Anonymous (nobody)
Summary: sb_mboxtrain.py fails to mark X-Spambayes-Trained
Initial Comment:
Symptoms:
I have chosen a two-message mbox folder called libtool as an
example, but I get the same result with larger folders as well.
irwin@starling> sb_mboxtrain.py -d ~/.spambayes/hammie.dbm -g ~/cdburn0/Mail/libtool
Training ham (/home/irwin/cdburn0/Mail/libtool):
Reading as Unix mbox
Trained 2 out of 2 messages
irwin@starling> sb_mboxtrain.py -d ~/.spambayes/hammie.dbm -g ~/cdburn0/Mail/libtool
Training ham (/home/irwin/cdburn0/Mail/libtool):
Reading as Unix mbox
Trained 2 out of 2 messages
Note that the second run still trains 2 messages rather than 0.
Also, the folder is left completely unchanged by these training
runs. It still has its September modification date:
ls -l ~/cdburn0/Mail/libtool
-rw------- 1 irwin irwin 4290 Sep 11 15:20 /home/irwin/cdburn0/Mail/libtool
and it contains no mail header line referring to X-Spambayes-Trained.
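Anyone trying to reproduce this can check for the header
independently of ls with a short script. The following is only a
sketch: it assumes a current Python 3 with the standard-library
mailbox module (not the Python 2.3.2 used in this report), and the
function name count_trained is made up for illustration.

```python
import mailbox

HEADER = "X-Spambayes-Trained"

def count_trained(path):
    """Return (marked, total) for the Unix mbox file at path.

    marked is the number of messages that carry the header at all,
    whatever its value. The file is only read, never modified.
    """
    # create=False: raise an error instead of creating a missing file.
    box = mailbox.mbox(path, create=False)
    try:
        total = 0
        marked = 0
        for msg in box:
            total += 1
            # Header lookup is case-insensitive; None means "absent".
            if msg[HEADER] is not None:
                marked += 1
        return marked, total
    finally:
        box.close()
```

For the libtool folder above, a result with marked equal to 0 after
a training run would confirm that no header was written.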
Configuration file:
cat ~/.spambayesrc
[Headers]
include_trained: True
[Storage]
persistent_storage_file: ~/.spambayes/hammie.dbm
persistent_use_database: True
The include_trained: True setting should be redundant (since it
is the default). I set it anyway to try to force the
X-Spambayes-Trained header to be written, but it made no difference.
I believe this bug has serious consequences. The recommended cron
tasks cannot be used to retrain spambayes with sb_mboxtrain.py,
because it behaves as if the -f option were always on. For users
unaware of the bug, the database is slowly distorted as the same
messages are trained again on every run, and earlier
classification mistakes cannot be corrected.
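The effect of the missing marker can be illustrated with a toy
model. This is a sketch only and is not Spambayes code: ToyTrainer,
run_twice and the "trained" dictionary key are hypothetical
stand-ins for the classifier, the cron job and the
X-Spambayes-Trained header.

```python
from collections import Counter

class ToyTrainer:
    """Toy token counter. NOT the Spambayes classifier."""

    def __init__(self):
        self.ham_counts = Counter()
        self.nham = 0

    def train_ham(self, msg, write_marker):
        # Skip messages that already carry the (simulated) marker.
        if msg.get("trained") == "ham":
            return False
        # Count each distinct token once per trained message.
        self.ham_counts.update(set(msg["tokens"]))
        self.nham += 1
        if write_marker:
            # Stands in for writing the header back to the folder.
            msg["trained"] = "ham"
        return True

def run_twice(write_marker):
    """Train the same two-message folder twice, like two cron runs."""
    folder = [
        {"tokens": ["libtool", "release"]},
        {"tokens": ["libtool", "patch"]},
    ]
    trainer = ToyTrainer()
    for _ in range(2):
        for msg in folder:
            trainer.train_ham(msg, write_marker)
    return trainer.nham, trainer.ham_counts["libtool"]
```

In this toy model, run_twice(False) counts every message and token
twice, while run_twice(True) counts each only once, which is the
behaviour I expected from the second sb_mboxtrain.py run.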
One presumable workaround (I haven't tried it yet) is to remove
the database in the cron task and retrain from scratch every
time, but that is wasteful of resources given the huge collection
of spam and ham mail folders I have built up.
If others here have trouble reproducing this bug, then here are
some details about my system:
I am running the Debian stable Linux distribution, modified by
downloading and installing the Python 2.3.2 tarball,
Python-2.3.2.tgz, from python.org.
python
Python 2.3.2 (#1, Oct 10 2003, 17:38:20)
[GCC 2.95.4 20011002 (Debian prerelease)] on linux2
Type "help", "copyright", "credits" or "license" for more
information.
>>>
I have downloaded and installed spambayes-1.0a6.1.tar.gz
from sourceforge.net.
If there is any difficulty verifying this bug, I will be happy to supply
more details about my system, run more tests, etc., since spam
waits for no man, and it is fairly urgent I get it fixed.
----------------------------------------------------------------------