[spambayes-dev] RE: [Spambayes] Re: Training empty messages problem

Fri Dec 17 00:29:12 CET 2004

> From: Kenny Pitt
> Sent: Thursday, December 16, 2004 4:21 PM

<...>

> That's all well and good, and I definitely appreciate the additional info,
> but we're not really concerned with interoperability here so all this may
> be overkill.  The sole purpose here is to provide SpamBayes with
> something that it can recognize and produce reasonable tokens from when
> asked to process an Exchange message received from another local Exchange
> user.

If that's the goal, no argument.  Microsoft will hopefully not control the
MUA market forever, so I was just looking to the future.  I am hoping that
either OpenOffice will one day produce an Outlook look-alike or that
Thunderbird will mature into a fully featured product competitive with
Outlook.  The Outlook plug-in is such a time-saver that it alone will keep
me on Outlook for a long time to come :)

<...>

> > My suggestion is that, of that whole series
> > of headers, the ones that would be of interest to Spambayes are:
> >
> > Resent-From:
> > Resent-Sender:
> > Resent-To:
> > Resent-Cc:
>
> There are some differences between what non-Outlook versions of SpamBayes
> such as sb_server, sb_filter, and sb_imapfilter will see and what the
> Outlook addin will see because of the way Outlook destroys the original
> structure of the message.  However, one thing that *is* preserved is the
> original headers of a message received via SMTP, so these headers
> should be included if they were part of the original message.
>
> By default, SpamBayes ignores these headers.  There are options that you
> can tweak in the config file if you want them processed, though.  I
> believe the Tokenizer:safe_headers option is where you would do this,
> but I've never used it myself so I'm not 100% certain.

I looked at
file:///c:/Program%20Files/SpamBayes/docs/outlook/docs/configuration.html
and it doesn't give any Tokenizer options at all, though they obviously
exist.  The directions also state that there are no experimental options in
this release.  Where else would I look to find a description of the
supported configuration options?

One of my email accounts also has a special header for Brightmail-detected
spam that would be helpful to tokenize.  This is not the Brightmail tracker
header itself, but one that my ISP adds.  The header always carries the
same text:

X-TDS-Spam: Potential Spam

Is there any support for tokenizing special headers like this?  Could I
manually add this to the list of safe headers to tokenize with that option?
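
To make the question concrete, here is a minimal sketch of what tokenizing
a fixed header like this could look like.  It uses only Python's standard
email module and is not SpamBayes code; the function name header_tokens and
the "name:word" token format are hypothetical, chosen for illustration, and
may not match what the SpamBayes tokenizer actually produces.

```python
from email import message_from_string


def header_tokens(raw_message, header_names):
    """Return one token per word found in each of the named headers.

    Hypothetical helper for illustration; not part of SpamBayes.
    """
    msg = message_from_string(raw_message)
    tokens = []
    for name in header_names:
        # get_all() returns every occurrence of the header, or the
        # supplied default (an empty list) when the header is absent.
        for value in msg.get_all(name, []):
            for word in value.lower().split():
                tokens.append("%s:%s" % (name.lower(), word))
    return tokens


sample = (
    "From: someone@example.com\n"
    "X-TDS-Spam: Potential Spam\n"
    "\n"
    "message body\n"
)
tokens = header_tokens(sample, ["X-TDS-Spam"])
print(tokens)
```

The same helper could be handed the Resent-* header names discussed
earlier; a message that lacks the header simply contributes no tokens.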

<...>

> I don't remember mentioning anything about comments presented by
> Outlook in the address string.  The comments came from Tony's first pass
> at simulating the address headers for an Exchange e-mail address,

Sorry if I misinterpreted this.  I thought that Outlook passed you an
address string that contained some of the information between parentheses.

--

Seth Goodman