[Spambayes] 1.0rc1, modifies Received headers

Sun May 30 03:34:56 EDT 2004

>     http://www.faqs.org/rfcs/rfc2822.html
> 
> and the way the long Received line was broken up conforms to the
> standard. The reason Python's email package breaks it is that RFC 2822
> says (in part):
>    There are two limits that this standard places on the number of
> characters in a line.  Each line of characters MUST be no more than
> 998 characters, and SHOULD be no more than 78 characters, excluding
> the CRLF.

OK, so it appears not to be a problem then. But since Postfix wrote the 
headers originally, I was used to seeing them that way.
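
For anyone who wants to see the refolding for themselves, here is a minimal 
sketch. It assumes the current standard-library email package with 
policy.default, which may differ in detail from the version Spambayes 
1.0rc1 uses. The Received value, host names and queue id are made up for 
illustration, and refold_header is just a helper name I picked.

```python
from email import message_from_string, policy

# Made-up example: one long, unfolded Received line (hosts are fictional).
LONG_RECEIVED = (
    "Received: from mail.example.org (mail.example.org [192.0.2.1]) "
    "by mx.example.net (Postfix) with ESMTP id 4F2A91B3C7 "
    "for <user@example.net>; Sun, 30 May 2004 03:34:56 -0400 (EDT)"
)


def refold_header(raw_header: str) -> str:
    """Parse one raw header line and return it as the email package folds it."""
    # A header line followed by a blank line is a complete (empty-bodied) message.
    msg = message_from_string(raw_header + "\n\n", policy=policy.default)
    name, value = msg.items()[0]
    # policy.fold() applies the policy's max_line_length (78 by default).
    return msg.policy.fold(name, value)


folded = refold_header(LONG_RECEIVED)
print(folded, end="")
```

The folded output is split at whitespace onto continuation lines, so 
unfolding it gives back the same tokens in the same order; only the line 
breaks differ.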

> Whoever produced the long Received line originally wasn't following the
> standard's recommendation, and Python's email package repairs that as a
> matter of course.

Hmm, this ties into what I'm going to say below...

> I don't think we're *trying* to change anything.  But the parsing tools
> we use do rewrite things, according to the relevant standards'
> recommendations, in semantically neutral ways.

I know Spambayes does a lot, since it is a fairly complete filtering system, 
but one recommendation I have, to help UNIX/procmail/scripting 
applications, is an option (maybe an extra sb_filter.py switch) that tells 
Spambayes to do _minimal_ work -- just output a score, instead of operating 
on files or modifying and outputting entire messages with headers added.

spamprobe (another wonderful Bayesian filter) takes this approach, and it 
has advantages for modularity. Acquiring a spam score need not modify 
the message in any way. Having the score stored in a variable also proves 
to be more fault-tolerant in the long run; even if the filter fails somehow, 
perhaps due to misconfiguration, you just have a blank score instead of an 
entire missing email!
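
To make the idea concrete, here is a rough sketch of what I mean by 
"score only". None of this is Spambayes or spamprobe code: toy_scorer, 
broken_scorer and score_only are hypothetical names, and the toy 
classifier is only a stand-in for a real one.

```python
from typing import Callable


def toy_scorer(raw_message: bytes) -> float:
    """Hypothetical stand-in for a real classifier (NOT the Spambayes API)."""
    return 1.0 if b"lottery" in raw_message.lower() else 0.0


def broken_scorer(raw_message: bytes) -> float:
    """Simulates a misconfigured filter, e.g. a missing database."""
    raise RuntimeError("database not found")


def score_only(raw_message: bytes, scorer: Callable[[bytes], float]) -> str:
    """Return just the score as text, or '' if scoring fails for any reason."""
    try:
        return f"{scorer(raw_message):.4f}"
    except Exception:
        # A failing filter yields a blank score; the message is never touched.
        return ""


message = b"Subject: You won the lottery\n\nClaim your prize now.\n"
print(repr(score_only(message, toy_scorer)))     # '1.0000'
print(repr(score_only(message, broken_scorer)))  # ''
```

The calling script keeps the original message and only reads the score 
string, so a failure in the filter costs you a score, not the mail.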

So although modifying the Received headers isn't wrong, I just can't help 
but feel it's unnecessary. It would be nice to be able to output just a 
score, without modifying the message.