[Spambayes-checkins] spambayes/Outlook2000 msgstore.py,1.57,1.58

Mark Hammond mhammond at users.sourceforge.net
Mon Jul 28 18:34:52 EDT 2003


Update of /cvsroot/spambayes/spambayes/Outlook2000
In directory sc8-pr-cvs1:/tmp/cvs-serv6514

Modified Files:
	msgstore.py 
Log Message:
Change the way we detect 'unsent' items - this way catches both unsent
items and copies of sent items.  The latter is important when you filter
folders other than the inbox, and you have Outlook configured to keep
copies of sent items in the current folder rather than in "Sent Items".
Previously, these items were trained on, inadequate headers and all.  Note
that mails sent by you and actually received back *are* still filtered.

(If we go with the "timer" approach, then it will be far more necessary to
watch folders that are the target of Outlook's builtin rules.)


Index: msgstore.py
===================================================================
RCS file: /cvsroot/spambayes/spambayes/Outlook2000/msgstore.py,v
retrieving revision 1.57
retrieving revision 1.58
diff -C2 -d -r1.57 -r1.58
*** msgstore.py	27 Jul 2003 23:46:48 -0000	1.57
--- msgstore.py	29 Jul 2003 00:34:50 -0000	1.58
***************
*** 497,501 ****
                            PR_PARENT_ENTRYID, # folder ID
                            PR_MESSAGE_CLASS_A, # 'IPM.Note' etc
!                           PR_MESSAGE_FLAGS, #unsent, from_me
                            ) 
  
--- 497,501 ----
                            PR_PARENT_ENTRYID, # folder ID
                            PR_MESSAGE_CLASS_A, # 'IPM.Note' etc
!                           PR_RECEIVED_BY_ENTRYID, # who received it
                            ) 
  
***************
*** 510,514 ****
          tag, parent_eid = prop_row[3]
          tag, msgclass = prop_row[4]
!         tag, flags = prop_row[5]
  
          self.id = store_eid, eid
--- 510,514 ----
          tag, parent_eid = prop_row[3]
          tag, msgclass = prop_row[4]
!         recby_tag, recby = prop_row[5]
  
          self.id = store_eid, eid
***************
*** 522,526 ****
          # Thus, searchkey is our long-lived message key.
          self.searchkey = searchkey
!         self.is_unsent = flags & MSGFLAG_UNSENT
          self.dirty = False
  
--- 522,537 ----
          # Thus, searchkey is our long-lived message key.
          self.searchkey = searchkey
!         # To check if a message has ever been received, we check the
!         # PR_RECEIVED_BY_ENTRYID flag.  Tim wrote in an old comment that
!         # An article on the web said the distinction can't be made with 100%
!         # certainty, but that a good heuristic is to believe that a
!         # msg has been received iff at least one of these properties
!         # has a sensible value: RECEIVED_BY_EMAIL_ADDRESS, RECEIVED_BY_NAME,
!         # RECEIVED_BY_ENTRYID PR_TRANSPORT_MESSAGE_HEADERS
!         # But MarkH can't find it, and believes and tests that
!         # PR_RECEIVED_BY_ENTRYID is all we need.
!         # This also means we don't need to check the 'unsent' flag - unsent
!         # messages never have the PR_RECEIVED_ properties either.
!         self.was_received = PROP_TYPE(recby_tag) == PT_BINARY
          self.dirty = False
  
***************
*** 559,579 ****
          # We don't attempt to filter:
          # * Non-mail items
!         # * Messages that have never been sent (ie, user-composed)
!         
!         # Note:  While we handle messages that have never been sent,
!         # we dont handle messages that were sent and moved from the
!         # Sent Items folder. It would be good not to train on them,
!         # since they are simply not received email.  An article on
!         # the web said the distinction can't be made with 100%
!         # certainty, but that a good heuristic is to believe that a
!         # msg has been received iff at least one of these properties
!         # has a sensible value:
!         #     PR_RECEIVED_BY_EMAIL_ADDRESS
!         #     PR_RECEIVED_BY_NAME
!         #     PR_RECEIVED_BY_ENTRYID
!         #     PR_TRANSPORT_MESSAGE_HEADERS
! 
          return self.msgclass.lower().startswith("ipm.note") and \
!                (not self.is_unsent or test_suite_running)
  
      def _GetPotentiallyLargeStringProp(self, prop_id, row):
--- 570,579 ----
          # We don't attempt to filter:
          # * Non-mail items
!         # * Messages that weren't actually received - this generally means user
!         #   composed messages yet to be sent, or copies of "sent items".
!         # It does *not* exclude messages that were user composed, but still
!         # actually received by the user (ie, when you mail yourself)
          return self.msgclass.lower().startswith("ipm.note") and \
!                (self.was_received or test_suite_running)
  
      def _GetPotentiallyLargeStringProp(self, prop_id, row):
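
The check introduced above can be tried outside Outlook with a small
standalone sketch.  This is an illustration under stated assumptions, not
the code in msgstore.py: the MAPI constants and the PROP_TYPE/PROP_TAG
helpers are re-declared locally so that no pywin32 install is needed, and
the function names was_received() and is_filterable() are hypothetical.
It assumes that a property missing from a MAPI row comes back with its tag
type set to PT_ERROR, which is why testing the tag type for PT_BINARY is
enough to tell whether PR_RECEIVED_BY_ENTRYID is present.

```python
# Standalone sketch of the "was this message ever received?" heuristic.
# Constants are re-declared here (assumption: same values as the MAPI headers)
# so the example runs without pywin32 or Outlook.
PT_ERROR = 0x000A
PT_BINARY = 0x0102
PROP_TYPE_MASK = 0xFFFF
MAPI_E_NOT_FOUND = 0x8004010F


def PROP_TYPE(tag):
    """Return the property-type half (low 16 bits) of a MAPI property tag."""
    return tag & PROP_TYPE_MASK


def PROP_TAG(prop_type, prop_id):
    """Build a property tag from a type and a property id."""
    return (prop_id << 16) | prop_type


PR_RECEIVED_BY_ENTRYID = PROP_TAG(PT_BINARY, 0x003F)


def was_received(prop_entry):
    """Hypothetical helper: prop_entry is a (tag, value) pair from a row.

    A present PR_RECEIVED_BY_ENTRYID keeps its PT_BINARY type; a missing
    one is assumed to be reported with a PT_ERROR type instead.
    """
    recby_tag, _recby = prop_entry
    return PROP_TYPE(recby_tag) == PT_BINARY


def is_filterable(msgclass, received, test_suite_running=False):
    """Hypothetical helper mirroring the filter condition in the diff."""
    return msgclass.lower().startswith("ipm.note") and \
           (received or test_suite_running)


# A received message carries a binary entry id; a draft or a copy of a
# sent item is modelled here as a PT_ERROR entry.
received_row = (PR_RECEIVED_BY_ENTRYID, b"\x00\x00\x00\x00")
missing_row = (PROP_TAG(PT_ERROR, 0x003F), MAPI_E_NOT_FOUND)
```

With these definitions, is_filterable("IPM.Note", was_received(missing_row))
is false, so drafts and copies of sent items are skipped, while a message
you mailed to yourself has the property set and is still filtered.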




