[Spambayes] Re: Outlook plugin plus Exchange

Tim Peters tim.one@comcast.net
Tue Nov 12 07:12:27 2002


[Mark Hammond]
> Something that confuses me completely here is:
>
> * Outlook shows headers with blank lines, appearing to royally
> screw things up.

Yes.

> * Our Outlook client simply appends the body(s) to the headers as a
> simple string.

Ditto.

> * We pass this re-constituted string back into the email package,
> and it too seems to screw up the header parsing!

Ditto again.  You're on a roll, Mark <wink>.

> ie, Outlook shows the headers as:
>
> """
> ...
> X-OriginalArrivalTime: 13 Apr 2002 10:19:01.0938 (UTC)
> FILETIME=[A1D8F920:01C1E2D4]
>
> --next_part_of_message
> of_message
> ...
> """
>
> And the traceback from the email package shows:
>
> "C:\Python22\spam\spambayes\email\Parser.py", line 105, in _parseheaders
>     raise Errors.HeaderParseError(
> HeaderParseError: Not a header, not a continuation: ``of_message''

This won't make sense to you just yet <wink>, but look at the full traceback
instead:

Traceback (most recent call last):
  File "C:\Python22\spam\spambayes\Outlook2000\train.py", line 67, in train_folder
    if train_message(message, isspam, mgr):
  File "C:\Python22\spam\spambayes\Outlook2000\train.py", line 36, in train_message
    stream = msg.GetEmailPackageObject()
  File "C:\Python22\spam\spambayes\Outlook2000\msgstore.py", line 431, in GetEmailPackageObject
    msg = email.message_from_string(text)
  File "C:\Python22\spam\spambayes\email\__init__.py", line 39, in message_from_string
    return Parser(_class, strict=strict).parsestr(s)
  File "C:\Python22\spam\spambayes\email\Parser.py", line 52, in parsestr
    return self.parse(StringIO(text), headersonly=headersonly)
  File "C:\Python22\spam\spambayes\email\Parser.py", line 48, in parse
    self._parsebody(root, fp)
  File "C:\Python22\spam\spambayes\email\Parser.py", line 206, in _parsebody
    msgobj = self.parsestr(part)
  File "C:\Python22\spam\spambayes\email\Parser.py", line 52, in parsestr
    return self.parse(StringIO(text), headersonly=headersonly)
  File "C:\Python22\spam\spambayes\email\Parser.py", line 46, in parse
    self._parseheaders(root, fp)
  File "C:\Python22\spam\spambayes\email\Parser.py", line 105, in _parseheaders
    raise Errors.HeaderParseError(
HeaderParseError: Not a header, not a continuation: ``of_message''

It's descending *into* the body when the error occurs, and at that point
it's really talking about the MIME-section headers, not the message headers,
starting with

> --next_part_of_message
> of_message

as a distinct section.
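
To see the two levels of header parsing, here is a minimal sketch using the
current standard-library email package.  The sample message is made up for
illustration, and current versions of the package record problems in a
`defects` list on each message object rather than raising HeaderParseError
the way the 2002 parser in the traceback did:

```python
import email

# Hypothetical message: the Content-Type header declares a multipart body,
# so the parser treats the text after the boundary line as a MIME section
# with headers of its own.
text = (
    "Subject: test\n"
    'Content-Type: multipart/mixed; boundary="next_part_of_message"\n'
    "\n"
    "--next_part_of_message\n"
    "of_message\n"
)

msg = email.message_from_string(text)
parts = msg.get_payload()      # a list of sub-messages when multipart

print(msg.is_multipart())
for part in parts:
    # "of_message" is neither a header nor a continuation line; the modern
    # parser notes the problem in part.defects and keeps going.
    print(part.defects, repr(part.get_payload()))
```

The message headers end at the first blank line; the line after the boundary
is then parsed as the start of a sub-part's headers, which is where the
complaint comes from.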

> Which seems very strange to me.  Why is the email package
> complaining about the "of_message" line, rather than itself stopping
> header parsing after that blank?

My guess is that it *did* stop after the first blank line, so far as the
*message* headers were concerned.  At this point it's looking at the headers
in the individual MIME sections.  I realize this still doesn't make sense to
you, but it will very soon <wink>:

> (Recall that the email package does not see the "ContentType:"
> header, as we remove that before sending it in.)

That's what confused me at first too, but it isn't true here:  we don't
remove the Content-Type header until *after* email.message_from_string()
returns a message.  We never got that far in this case.

> I assume I am simply missing how messages are parsed.

Maybe, but it's irrelevant <wink>.  By the time I'm stripping the MIME
headers in the Outlook client, it's too late to do any good.  I don't know
how to do better, though (with minor effort) -- it's really a job for Barry.
We've been saved so far because the email parser *is* lax by default, and
doesn't complain about missing MIME armor.  It does complain about MIME
armor that makes no sense, though, and I've never seen that happen in any of
my email.  If we managed to get Content-Type out of the Outlook headers
before calling message_from_string, there's no problem with this msg (I
tried that -- it works -- but I removed Content-Type by hand with an editor,
which isn't terribly scalable <wink>).
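
Doing the same edit in code might look something like the sketch below.  It
assumes the header text and the body are available as separate strings before
they are glued together; `strip_content_type` is a hypothetical helper (not
part of spambayes or the email package), and the sample headers are invented:

```python
import email

def strip_content_type(headers_text):
    """Return headers_text without the Content-Type header.

    Continuation lines (those starting with a space or tab) that belong
    to the dropped header are dropped along with it.
    """
    kept = []
    skipping = False
    for line in headers_text.splitlines():
        if skipping and line[:1] in (" ", "\t"):
            continue  # continuation of the header being dropped
        skipping = line.lower().startswith("content-type:")
        if not skipping:
            kept.append(line)
    return "\n".join(kept)

# Invented sample data for illustration only.
headers = (
    "From: someone@example.com\n"
    "Content-Type: multipart/mixed;\n"
    '\tboundary="next_part_of_message"\n'
    "Subject: test\n"
)
body = "--next_part_of_message\nof_message\n"

# With no Content-Type header, the parser treats the body as plain text
# and never tries to parse MIME section headers.
cleaned = strip_content_type(headers)
msg = email.message_from_string(cleaned + "\n\n" + body)
```

The cost of this approach is that the MIME structure is thrown away: the body
arrives as one undivided string, boundary lines and all.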



