[Spambayes] bug: long lines trashed w/ mboxtrain.py
Sam Shah
shahsam at eecs.umich.edu
Wed Feb 26 15:34:03 EST 2003
Consider a simple mbox like this (excuse the long To line):
From test at umich.edu Fri Jan 24 17:18:05 2003
From: "Test" <test at umich.edu>
To: "Someone Test #A" <someone at eecs.umich.edu>,<someone at eecs.umich.edu>,"Someone Test #B" <someone at umich.edu>, "Someone Test #C" <someone at eecs.umich.edu>, "Someone Test #D" <someone at eecs.umich.edu>
Subject: testing
Date: Fri, 24 Jan 2003 17:21:35 -0500
Status: RO
Content-Length: 5
Lines: 1
Test
We run mboxtrain.py on it, twice:
[sligo tmp/spambayes-1.0a2 ]$ ./mboxtrain.py -d test -g msg
Training ham (msg):
Reading as Unix mbox
Trained 1 out of 1 messages
[sligo tmp/spambayes-1.0a2 ]$ ./mboxtrain.py -d test -g msg
Training ham (msg):
Reading as Unix mbox
Trained 0 out of 1 messages
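We get the following. Notice that the To line has been munged: it is rewrapped across three lines, and the address for "Someone Test #B" and the entire "Someone Test #C" entry are missing.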
From test at umich.edu Fri Jan 24 17:18:05 2003
From: "Test" <test at umich.edu>
To: "Someone Test #A"
<someone at eecs.umich.edu>,<someone at eecs.umich.edu>,"Someone Test #B"
"Someone Test #D" <someone at eecs.umich.edu>
Subject: testing
Date: Fri, 24 Jan 2003 17:21:35 -0500
Status: RO
Content-Length: 5
Lines: 1
X-Spambayes-Trained: ham
Test
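To make the damage easier to see, here is a small sketch that compares the quoted display names in the To header before and after training. The two strings are copied from the mbox contents above; the helper names (display_names, missing_names) are hypothetical and are not part of Spambayes.

```python
import re

# To header as it appears in the original mbox (copied from the report).
ORIGINAL_TO = (
    '"Someone Test #A" <someone at eecs.umich.edu>,<someone at eecs.umich.edu>,'
    '"Someone Test #B" <someone at umich.edu>, '
    '"Someone Test #C" <someone at eecs.umich.edu>, '
    '"Someone Test #D" <someone at eecs.umich.edu>'
)

# To header as it appears after the second mboxtrain.py run (copied from the report).
MUNGED_TO = (
    '"Someone Test #A"\n'
    '<someone at eecs.umich.edu>,<someone at eecs.umich.edu>,"Someone Test #B"\n'
    '"Someone Test #D" <someone at eecs.umich.edu>'
)


def display_names(header_value):
    """Return the double-quoted display names in a header value, in order."""
    return re.findall(r'"([^"]*)"', header_value)


def missing_names(before, after):
    """Return display names present in `before` but absent from `after`."""
    kept = set(display_names(after))
    return [name for name in display_names(before) if name not in kept]


if __name__ == "__main__":
    print(missing_names(ORIGINAL_TO, MUNGED_TO))
```

This only looks at display names, so it reports the lost #C entry but does not catch the missing address for #B; a fuller check would compare the address parts as well.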
I tried looking at the code for mboxtrain, but I couldn't find an
obvious problem. This occurs with Spambayes v1.0a2 running on
Python 2.2.2.
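For anyone trying to reproduce this outside mboxtrain.py, a minimal sketch of a round-trip check follows. It assumes the header is rewritten through Python's standard email package, which is a guess and has not been confirmed against the mboxtrain source. The helper names (unfold, header_survives_round_trip) are hypothetical, and the result may well differ between Python versions.

```python
import email

# Sample message from the report; addresses keep the archive's obfuscated "at" form.
RAW_MESSAGE = (
    'From: "Test" <test at umich.edu>\n'
    'To: "Someone Test #A" <someone at eecs.umich.edu>,'
    '<someone at eecs.umich.edu>,"Someone Test #B" <someone at umich.edu>, '
    '"Someone Test #C" <someone at eecs.umich.edu>, '
    '"Someone Test #D" <someone at eecs.umich.edu>\n'
    'Subject: testing\n'
    '\n'
    'Test\n'
)


def unfold(value):
    """Collapse all whitespace runs so that only header content is compared."""
    return " ".join(value.split())


def header_survives_round_trip(raw, name="To"):
    """Parse, serialize, and reparse a message; report whether `name` is unchanged.

    Line folding is ignored, so only lost or altered content counts as a failure.
    """
    parsed = email.message_from_string(raw)
    reparsed = email.message_from_string(parsed.as_string())
    return unfold(parsed[name]) == unfold(reparsed[name])


if __name__ == "__main__":
    print(header_survives_round_trip(RAW_MESSAGE))
```

If this reports False on an affected setup, the loss is happening in header serialization and not in the mbox reading or writing code.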
Thanks,
Sam