[Spambayes] Re: Outlook plugin plus Exchange

Tim Peters tim.one@comcast.net
Tue Nov 12 06:25:02 2002


[Piers Haken]
> Yup, Outlook displays it properly.

Meaning it shows you the HTML part, as rendered HTML, I bet.

> I have a feeling that it's oracle's mess,

Not from what you showed below.  It's not hard to find the end of the
headers!  The first blank line ends them.  That Outlook is showing you stuff
beyond that in its view of the headers says it didn't suck out the headers
properly to begin with.
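
To make the "first blank line ends them" rule concrete, here is a minimal sketch in Python. The function name split_headers is hypothetical (it is not part of spambayes or the email package), and it assumes the raw message is already a string using either \r\n or bare \n line endings.

```python
import re

def split_headers(raw):
    """Return (headers, rest), splitting at the first blank line.

    Hypothetical helper for illustration only.
    """
    # A blank line shows up as two consecutive line endings;
    # accept either \r\n or a bare \n for each of them.
    match = re.search(r"\r?\n\r?\n", raw)
    if match is None:
        # No blank line at all: treat everything as header text.
        return raw, ""
    return raw[:match.start()], raw[match.end():]
```

Applied to a header block like the one Outlook reports further down, everything from the first "--next_part_of_message" line onward would land in the second element, not in the headers.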

> but that outlook just ignores the invalid MIME-part headers

By this point Outlook isn't looking *at all* at the part that's damaged (and
probably damaged by Outlook itself).  It's just sucking out the PR_BODY_HTML
property from the msg and rendering it, and the value of that property
contains no MIME armor at all, just HTML stuff.

> -- maybe spambayes can do the same.

I keep telling people never to call email.message_from_string() directly,
but they don't listen <wink>.  The tokenizer's way of getting an email
message from a string would have at least recovered the message body in this
case, but would have lost the headers entirely (they're crap -- what can you
do?).
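
The general shape of such a fallback can be sketched as follows. This is not the tokenizer's actual code: get_message and body_only_message are hypothetical names, and the sketch assumes a current Python whose email package records defects rather than raising, so a strict policy is used to turn defects into exceptions.

```python
import email
import email.errors
import email.policy
import re
from email.message import Message

# Assumption: have the parser raise on any defect instead of recording it.
STRICT = email.policy.compat32.clone(raise_on_defect=True)

def body_only_message(text):
    """Hypothetical fallback: drop the header block, keep the rest as payload."""
    match = re.search(r"\r?\n\r?\n", text)
    # With no blank line at all, treat the whole string as the body.
    body = text[match.end():] if match else text
    msg = Message()
    msg.set_payload(body)
    return msg

def get_message(text):
    """Parse text normally; if the parser objects, fall back to body only."""
    try:
        return email.message_from_string(text, policy=STRICT)
    except (email.errors.MessageError, email.errors.MessageDefect):
        # The headers are presumed damaged, so they are thrown away.
        return body_only_message(text)
```

The tradeoff is the one described above: the fallback gives up every header in exchange for always returning something that can be tokenized.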

> The problem is multiplied by the fact that outlook includes the MIME-
> part headers and boundaries with the regular headers,

The Outlook client actually deletes those from the headers, because:

> but separates the body parts and attachments. I don't think there's
> any way to get the original, unseparated message from the API.

That's right, there isn't.  Outlook's basic structure appears to predate
MIME catching on, and the MIME support very much appears hacked in after it
was too late for a change in worldview.  It's a mess that way, if you want
to (as we do) get MIME back out.  The Outlook client right now "loses" all
attachments, and even loses the msg body if the msg has been digitally
signed (because it turns out Outlook does Yet Another Entirely Different
Thing for signed msgs, leaving the two "normal" body properties empty and
stuffing the body *plus* the signature into Yet Another property).
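
As an illustration of what "getting MIME back out" involves once the bodies have been separated, here is a rough sketch using the standard library. It is not what the Outlook client does: rebuild_message and its parameters are hypothetical, and it assumes the plain-text and HTML bodies have already been fetched as strings by some other means. Attachments and signed messages are not handled.

```python
from email.message import EmailMessage

def rebuild_message(subject, sender, plain_body, html_body=""):
    """Hypothetical helper: reassemble a MIME message from separated bodies."""
    msg = EmailMessage()
    msg["Subject"] = subject
    msg["From"] = sender
    # The plain body becomes a text/plain payload.
    msg.set_content(plain_body)
    if html_body:
        # Adding an alternative converts the message to multipart/alternative.
        msg.add_alternative(html_body, subtype="html")
    return msg
```

The result is a fresh message with newly generated boundaries, not the original as sent, which is the best that can be done when the unseparated original is unavailable.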

> The Outlook UI shows the headers as:

By this do you mean View -> Options -> Internet headers?

<oracle-headers>
Microsoft Mail Internet Headers Version 2.0
Received: from inet-mail7.oracle.com ([209.246.10.171]) by
zeus.sfhq.friskit.com with Microsoft SMTPSVC(5.0.2195.4453);
         Sat, 13 Apr 2002 03:19:01 -0700
Received: from blaster-smtp.oracle.com (eblast01.oracleeblast.com
[148.87.9.11])
        by inet-mail7.oracle.com (Switch-2.2.1/Switch-2.2.0) with ESMTP id
g3DA8GV30065
        for PIERSH@FRISKIT.COM; Sat, 13 Apr 2002 03:08:16 -0700
Date: Sat, 13 Apr 2002 03:08:16 -0700
Message-Id: <200204131008.g3DA8GV30065@inet-mail7.oracle.com>
Subject: Oracle University iSeminars
From: Oracle Corporation<replies@oracleeblast.com>
To: PIERSH@FRISKIT.COM
Reply-To: replies@oracleeblast.com
Content-Transfer-Encoding: 8bit
MIME-Version: 1.0
Content-Type: multipart/alternative;
    boundary="next_part_of_message"
Return-Path: replies@oracleeblast.com
X-OriginalArrivalTime: 13 Apr 2002 10:19:01.0938 (UTC)
FILETIME=[A1D8F920:01C1E2D4]

--next_part_of_message
of_message
ge

--next_part_of_message
Content-Type: text/html

</oracle-headers>

There's no way blank lines can be part of the headers, so I don't believe
Oracle screwed this up.  They really are blank, too, as the traceback you
sent earlier showed this at the tail end of the headers:

\r\n
--next_part_of_message\r\n
of_message\r\n
ge\r\n
\r\n
--next_part_of_message\r\n
Content-Type: text/html\r\n
\r\n
\n

and *our* code put in the lone oddball \n after the end of what Outlook told
us were the original headers.  If that's common damage, I can worm around
it.



