[Spambayes-checkins] spambayes/Outlook2000 manager.py,1.14,1.15
Tim Peters
tim_one@users.sourceforge.net
Mon, 21 Oct 2002 11:55:32 -0700
Update of /cvsroot/spambayes/spambayes/Outlook2000
In directory usw-pr-cvs1:/tmp/cvs-serv4305
Modified Files:
manager.py
Log Message:
GetBayesStreamForMessage(): For every msg with MIME structure, Outlook
left the boundary info in the headers, but there are no boundaries in
the body. As a result, all of the body was invisible to the Python email
pkg. Reconstituting the full original email from Outlook appears to be
a real bitch -- maybe Mozilla has code for this we can use (but I suspect
its import-from-Outlook gimmick actually crawls over the .pst file; I
haven't used it, just read about it).

In the meantime, quick hack: squash the text part (if any) and the HTML
part (if any) together as one big text blob, and if the headers make any
claims about MIME type and/or transfer encoding, simply delete those
header lines.
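
The quick hack described above can be sketched in present-day Python 3 roughly
as follows. This is only an illustration, not the checked-in code: the helper
name strip_mime_claims, its parameters, and the sample headers are all
hypothetical, and the order in which the HTML and text parts are joined is an
assumption.

```python
import email


def strip_mime_claims(headers, text_body, html_body=""):
    """Parse headers plus a flattened body, dropping MIME header claims.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    # Squash the HTML part (if any) and the text part (if any) into one blob.
    body = ""
    if html_body:
        body += html_body + "\n"
    body += text_body
    # A blank line separates the header block from the body.
    msg = email.message_from_string(headers.rstrip("\n") + "\n\n" + body)
    # Deleting a header that is absent is a no-op, so no membership test
    # is needed before each del.
    del msg["Content-Type"]
    del msg["Content-Transfer-Encoding"]
    return msg


if __name__ == "__main__":
    sample_headers = (
        "From: a@example.com\n"
        "Subject: hi\n"
        "MIME-Version: 1.0\n"
        'Content-Type: multipart/alternative; boundary="----=_X"\n'
        "Content-Transfer-Encoding: 7bit\n"
    )
    msg = strip_mime_claims(sample_headers, "plain words", "<p>html words</p>")
    print(msg.get_content_type())
    print(msg.get_payload())
```

With the two headers removed, the message no longer makes any claim about
MIME structure, so the parsed object is treated as one plain text blob.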
Index: manager.py
===================================================================
RCS file: /cvsroot/spambayes/spambayes/Outlook2000/manager.py,v
retrieving revision 1.14
retrieving revision 1.15
diff -C2 -d -r1.14 -r1.15
*** manager.py 20 Oct 2002 23:51:04 -0000 1.14
--- manager.py 21 Oct 2002 18:55:30 -0000 1.15
***************
*** 84,87 ****
--- 84,89 ----
      def GetBayesStreamForMessage(self, message):
          # Note - caller must catch COM error
+         import email
+
          headers = message.Fields[0x7D001E].Value
          headers = headers.encode('ascii', 'replace')
***************
*** 92,97 ****
              body = ""
          body += message.Text.encode("ascii", "replace")
!         return headers + body
!

      def LoadBayes(self):
--- 94,109 ----
              body = ""
          body += message.Text.encode("ascii", "replace")
!
!         # XXX If this was originally a MIME msg, we're hosed at this point --
!         # the boundary tag in the headers doesn't exist in the body, and
!         # the msg is simply ill-formed.  The miserable hack here simply
!         # squashes the text part (if any) and the HTML part (if any) together,
!         # and strips MIME info from the original headers.
!         msg = email.message_from_string(headers + '\n' + body)
!         if msg.has_key('content-type'):
!             del msg['content-type']
!         if msg.has_key('content-transfer-encoding'):
!             del msg['content-transfer-encoding']
!         return msg

      def LoadBayes(self):
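
The ill-formed case named in the XXX comment (a boundary declared in the
headers but never seen in the body) can be reproduced with a small sketch.
It assumes a current Python 3 email package, which as far as I know records
a StartBoundaryNotFoundDefect on the parsed message; the 2002 package
evidently behaved differently, since the log message reports the body as
invisible. The sample message and the function name boundary_missing are
hypothetical.

```python
import email
from email import errors

# Hypothetical sample: the headers claim a multipart boundary, but the body
# contains no boundary lines at all.
RAW = (
    "Subject: demo\n"
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/alternative; boundary="BOUNDARY"\n'
    "\n"
    "just the flattened text\n"
)


def boundary_missing(raw):
    """Return True if the parser never found the declared start boundary."""
    msg = email.message_from_string(raw)
    return any(
        isinstance(defect, errors.StartBoundaryNotFoundDefect)
        for defect in msg.defects
    )


if __name__ == "__main__":
    print(boundary_missing(RAW))
```

A check like this could be used to decide whether the header-stripping hack
is needed for a given message, rather than applying it unconditionally.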