[Spambayes] Training corrupts mbox files
Neale Pickett
neale at woozle.org
Tue Apr 29 19:59:56 EDT 2003
David McLaughlin <david at dsmcl.net> writes:
> Thanks for taking a look at it!
>
> I have put a sample before and after mbox at the following location:
>
> ftp://ftp.dsmcl.net/spambayes_samplembox.tgz
>
> It looks like it may be duplicating some lines in the header, and
> adding an extra line break, which generates "extra" bogus mail
> messages.
Yeah, sure enough. You're using mutt?
The mbox "standard" is that any line beginning with "From " denotes a
new message. So a diff of those two mailboxes shows things like this:
 From removed at example.com Mon Apr 28 16:37:18 2003
 Return-Path: <removed>
 Delivered-To: <removed>
+X-Spambayes-Trained: spam
+
 From removed at example.com Mon Apr 28 16:37:18 2003
 Return-Path: <removed>
 Delivered-To: <removed>
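
To make the "From " rule concrete, here is a minimal sketch in Python of how a reader that follows it would split the mailbox above. The names split_mbox and is_header_only are made up for illustration and are not part of spambayes; the sample text mirrors the diff, with a made-up body line added so the second message has content.

```python
def split_mbox(text):
    """Split mbox text into messages.

    Follows the loose mbox convention: every line that starts with
    "From " opens a new message. Anything before the first such line
    is ignored.
    """
    messages = []
    for line in text.splitlines(keepends=True):
        if line.startswith("From "):
            messages.append([line])
        elif messages:
            messages[-1].append(line)
    return ["".join(m) for m in messages]


def is_header_only(message):
    """True if nothing but whitespace follows the first blank line."""
    _headers, _sep, body = message.partition("\n\n")
    return body.strip() == ""


# Sample modeled on the diff above; "Body text" is a placeholder.
sample = (
    "From removed at example.com Mon Apr 28 16:37:18 2003\n"
    "Return-Path: <removed>\n"
    "Delivered-To: <removed>\n"
    "X-Spambayes-Trained: spam\n"
    "\n"
    "From removed at example.com Mon Apr 28 16:37:18 2003\n"
    "Return-Path: <removed>\n"
    "Delivered-To: <removed>\n"
    "\n"
    "Body text\n"
)

messages = split_mbox(sample)
for number, message in enumerate(messages, start=1):
    kind = "header-only" if is_header_only(message) else "has a body"
    print(number, kind)
```

Run over a real mailbox, a check like this would flag the header-only fragments, which may help narrow down which messages arrive with the duplicated header block.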
I think spambayes is actually doing the right thing here--it's taking a
weird mbox and un-weirding it. I think Tim Stone might be working on a
generic message store thingy: Tim, would that eliminate the need to
rewrite mailboxes altogether?
But David, if I were you I'd start trying to hunt down what's creating
those duplicate headers. It might be some sort of wonky procmail recipe
that just writes out headers and then drops through, but that's just a
shot in the dark. Heh, maybe it's hammiefilter <0.7 wink>
Neale