[Spambayes-checkins] spambayes timtoken.py,1.6,1.7

Guido van Rossum gvanrossum@users.sourceforge.net
Fri, 06 Sep 2002 21:50:12 -0700


Update of /cvsroot/spambayes/spambayes
In directory usw-pr-cvs1:/tmp/cvs-serv9264

Modified Files:
	timtoken.py 
Log Message:
Made tokenize() polymorphic.  It now accepts an email.Message.Message
instance, a file-like object (anything with a readline method), or a
string (anything else).  This is a major speed boost for hammie.py,
which already has Message objects but had to convert them to strings
before passing them to tokenize(), which then parsed each string back
into a Message object!
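
The dispatch described above can be sketched on its own, separate from
the tokenizer.  This is only an illustration and not the code in
timtoken.py: the helper name to_message is hypothetical, it uses the
current Python 3 module layout (email.message.Message) rather than the
email.Message.Message spelling in the diff, and it leaves out the
MessageParseError fallback that the real tokenize() keeps for strings.

```python
import email
import io
from email.message import Message


def to_message(obj):
    """Return a Message for a Message, a file-like object, or a string.

    Hypothetical helper: mirrors the three-way check in the log message.
    """
    if isinstance(obj, Message):
        # Already parsed: hand it back untouched, so nothing is re-parsed.
        return obj
    if hasattr(obj, "readline"):
        # Duck-typed file check: anything with a readline method.
        return email.message_from_file(obj)
    # Anything else is treated as the raw message text.
    return email.message_from_string(obj)


raw = "Subject: hello\n\nbody text\n"
parsed = email.message_from_string(raw)

same = to_message(parsed)                 # passes through unchanged
from_file = to_message(io.StringIO(raw))  # parsed from a file-like object
from_text = to_message(raw)               # parsed from a string
```

The order of the checks matters: the isinstance test comes first so a
caller that already holds a Message, as hammie.py does, skips parsing
entirely.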


Index: timtoken.py
===================================================================
RCS file: /cvsroot/spambayes/spambayes/timtoken.py,v
retrieving revision 1.6
retrieving revision 1.7
diff -C2 -d -r1.6 -r1.7
*** timtoken.py	7 Sep 2002 01:41:28 -0000	1.6
--- timtoken.py	7 Sep 2002 04:50:10 -0000	1.7
***************
*** 2,6 ****
  
  import email
- from email import message_from_string
  
  from sets import Set
--- 2,5 ----
***************
*** 555,566 ****
              yield 'content-transfer-encoding:' + x.lower()
  
! def tokenize(string):
      # Create an email Message object.
!     try:
!         msg = message_from_string(string)
!     except email.Errors.MessageParseError:
!         yield 'control: MessageParseError'
!         # XXX Fall back to the raw body text?
!         return
  
      # Special tagging of header lines.
--- 554,570 ----
              yield 'content-transfer-encoding:' + x.lower()
  
! def tokenize(obj):
      # Create an email Message object.
!     if isinstance(obj, email.Message.Message):
!         msg = obj
!     elif hasattr(obj, "readline"):
!         msg = email.message_from_file(obj)
!     else:
!         try:
!             msg = email.message_from_string(obj)
!         except email.Errors.MessageParseError:
!             yield 'control: MessageParseError'
!             # XXX Fall back to the raw body text?
!             return
  
      # Special tagging of header lines.