[spambayes-dev] RE: [Spambayes] Re: Training empty messages problem

Kenny Pitt kennypitt at hotmail.com
Mon Dec 13 20:16:32 CET 2004


Tony Meyer wrote:
>> When I look in my Junk mail folder those empty spam still have a spam
>> probability of below 50%, partly caused by those "message-id:invalid"
>> headers (sorry I didn't pick that up sooner). I looked at some
>> Exchange mails in my inbox and all those have invalid message-id. So
>> just like from:none, all of my internal Exchange mails have
>> "message-id:invalid".
> 
> Interesting - I wouldn't have thought they would have any message-id.
> Could you pick a random Exchange (good) mail, and send me a copy of
> the message id header for that message?  Maybe there's an Exchange
> format for the things that we can leverage.

I get the same message-id behavior from our Exchange server.  Here's the
complete set of headers from a recent mail as SpamBayes sees them in Show
Clues.

"""
X-Exchange-Message: true
Subject: A recent Exchange message
From: Joe Smith
To: All Employees
X-Exchange-Delivery-Time: Mon, 13 Dec 2004 13:52:22 -0500
"""

This was taken using the latest CVS.  Names have been changed to protect the
innocent, but otherwise the headers are completely intact.  Notice that
there is no Message-ID header of any sort, and that the From and To fields
do not use the Internet standard address format.  The following tokens were
included among the clues, and are typical of most, if not all, of my Exchange
mail:

"""
token                               spamprob         #ham  #spam
'message-id:invalid'                0.214766           19      9
'x-mailer:none'                     0.622068           88    258
'from:no real name:2**0'            0.642539           29     93
"""
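
To illustrate why headers like the ones above would end up with these
tokens, here is a minimal sketch of header tokenizing.  It is not the actual
SpamBayes tokenizer code: the function name, the regular expression, and the
exact token spellings for the valid case are assumptions made for
illustration.  The point is only that a missing Message-ID falls through to
the same "invalid" token as a malformed one, and a missing X-Mailer becomes
"none".

```python
import re
from email import message_from_string

# Assumed pattern for illustration: an ID shaped like <local@host>.
MESSAGE_ID_RE = re.compile(r"\s*<[^@>]+@([^>]+)>\s*$")


def header_tokens(raw_headers):
    """Return simplified message-id and x-mailer tokens for a header block."""
    msg = message_from_string(raw_headers)
    tokens = []

    # A missing header is treated as an empty string, so it fails the
    # pattern and is reported the same way as a malformed ID.
    msgid = msg.get("Message-ID", "")
    match = MESSAGE_ID_RE.match(msgid)
    if match:
        tokens.append("message-id:@%s" % match.group(1).lower())
    else:
        tokens.append("message-id:invalid")

    # No X-Mailer header at all is recorded as "none".
    mailer = msg.get("X-Mailer")
    tokens.append("x-mailer:%s" % (mailer.lower() if mailer else "none"))
    return tokens


EXCHANGE_HEADERS = """\
X-Exchange-Message: true
Subject: A recent Exchange message
From: Joe Smith
To: All Employees
X-Exchange-Delivery-Time: Mon, 13 Dec 2004 13:52:22 -0500

"""

print(header_tokens(EXCHANGE_HEADERS))
# prints ['message-id:invalid', 'x-mailer:none']
```

With this kind of logic, every internal Exchange message produces the same
pair of tokens regardless of content, which matches what Show Clues reports.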

Maybe there's a property in the Outlook message object somewhere that we
need to retrieve and add to the headers when we reconstruct the message?
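
A rough sketch of what that might look like, purely as an assumption: the
`props` dictionary and the `internet_message_id` key below are hypothetical
stand-ins for whatever the Outlook message object actually exposes, and
`rebuild_headers` is not an existing SpamBayes function.  The idea is simply
to add a Message-ID header during reconstruction only when a value is
available.

```python
from email.message import Message


def rebuild_headers(props):
    """Build a header-only message from a dict of (hypothetical) properties."""
    msg = Message()
    for name in ("Subject", "From", "To"):
        if props.get(name):
            msg[name] = props[name]

    # Hypothetical key: only emit Message-ID if the source object has one.
    msgid = props.get("internet_message_id")
    if msgid:
        msg["Message-ID"] = msgid
    return msg


rebuilt = rebuild_headers({
    "Subject": "A recent Exchange message",
    "From": "Joe Smith",
    "To": "All Employees",
    "internet_message_id": "<12345@exchange.example.com>",
})
print(rebuilt["Message-ID"])
```

If no such property exists for internal Exchange mail, the header is simply
left out and the tokens stay as they are today.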

-- 
Kenny Pitt


