[Spambayes] 1.0rc1, modifies Received headers
lists03 at pc9.org
Sun May 30 03:34:56 EDT 2004
> and the way the long Received line was broken up conforms to the
> standard. The reason Python's email package breaks it is that RFC 2822
> says (in part):
> There are two limits that this standard places on the number of
> characters in a line. Each line of characters MUST be no more than
> 998 characters, and SHOULD be no more than 78 characters, excluding
> the CRLF.
OK, it appears not to be a problem then. But since Postfix wrote the
headers originally, I was used to seeing them like that.
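For reference, the refolding described in the quote above can be reproduced with the standard library's email.header.Header class. This is only a sketch: the Received value is made up, the helper name fold_header is hypothetical, and it uses the current Python 3 email package, which may not fold at exactly the same points as the version Spambayes 1.0rc1 shipped with.

```python
from email.header import Header

# Made-up Received value, long enough to exceed the 78-character
# recommendation when written on a single line.
received = ("from mail.example.org (mail.example.org [192.0.2.10]) "
            "by mx.example.net (Postfix) with ESMTP id 1234567890 "
            "for <user@example.net>; Sun, 30 May 2004 03:34:56 -0400 (EDT)")


def fold_header(name, value, limit=78):
    """Fold a header value so that no output line exceeds `limit`.

    Hypothetical helper. header_name tells Header how much room the
    "Name: " prefix takes up on the first line.
    """
    return Header(value, header_name=name, maxlinelen=limit).encode()


folded = fold_header("Received", received)
print("Received: " + folded)
```

The folds fall only on existing whitespace, so unfolding the result gives back the same header value. That is the sense in which the rewrite is semantically neutral, even though the bytes differ from what Postfix wrote.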
> Whoever produced the long Received line originally wasn't following the
> standard's recommendation, and Python's email package repairs that as a
> matter of course.
Hmm, this ties into what I'm going to say below...
> I don't think we're *trying* to change anything. But the parsing tools
> we use do rewrite things, according to the relevant standards'
> recommendations, in semantically neutral ways.
I know Spambayes does a lot, as it is a rather complete filtering system,
but one recommendation I would make, to help UNIX/procmail/scripting
applications, is a way (maybe via an extra sb_filter.py option) to tell
Spambayes to do _minimal_ work -- just output a score, instead of operating
on files or modifying and outputting entire messages with headers added.
spamprobe (another wonderful Bayesian filter) takes this approach, and it
has advantages for modularity. Acquiring a spam score need not modify
the message in any way. Having the score stored in a variable also proves
to be more fault-tolerant in the long run; even if the filter fails somehow,
perhaps due to misconfiguration, you just have a blank score instead of an
entire missing email!
So although modifying the Received headers isn't wrong, I just can't help
but feel it's unnecessary. It would be nice to be able to output just a
score, without modifying the message.
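To illustrate the fault-tolerance point, here is a rough sketch of a wrapper that feeds a message to some external scoring command and keeps only the score. Everything in it is an assumption for illustration: get_score is a hypothetical helper, and the command is whatever program prints a score on stdout. It does not use any existing Spambayes or spamprobe option.

```python
import subprocess
import sys


def get_score(message_bytes, command, timeout=30):
    """Return the score printed by `command`, or "" if anything fails.

    The message is only written to the command's stdin. The caller
    keeps its own copy, so a broken filter cannot lose or alter it.
    """
    try:
        result = subprocess.run(command, input=message_bytes,
                                capture_output=True, timeout=timeout)
    except (OSError, subprocess.TimeoutExpired):
        return ""  # filter missing or hung: blank score
    if result.returncode != 0:
        return ""  # filter reported an error: blank score
    return result.stdout.decode("ascii", "replace").strip()


if __name__ == "__main__":
    # Stand-in for a real filter: a tiny command that prints a score.
    fake_filter = [sys.executable, "-c", "print(0.97)"]
    message = b"Subject: test\r\n\r\nhello\r\n"
    print("score:", get_score(message, fake_filter))
```

If the command is missing, times out, or exits with an error, the caller sees an empty score and still has the original message byte for byte.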