[Spambayes-checkins] spambayes timtoken.py,1.6,1.7
Guido van Rossum
gvanrossum@users.sourceforge.net
Fri, 06 Sep 2002 21:50:12 -0700
Update of /cvsroot/spambayes/spambayes
In directory usw-pr-cvs1:/tmp/cvs-serv9264
Modified Files:
timtoken.py
Log Message:
Made tokenize() polymorphic. It now accepts an email.Message.Message
instance, a file-like object (something with a readline method), or a
string (anything else). This is a major speed boost for hammie.py,
which has Message objects, but had to convert them to strings before
passing to tokenize(), which parsed the string into a Message object
again!
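
For readers on a current Python, the dispatch described above can be
sketched roughly as follows. This is an illustration only, not the code
from this checkin: the helper name to_message() is hypothetical, the
module names are the Python 3 spellings (email.message rather than
email.Message), and the MessageParseError fallback is left out.

```python
import email
import email.message
import io


def to_message(obj):
    """Return an email Message for a Message, a file-like object, or a string.

    Hypothetical helper: it mirrors the three-way dispatch in the log
    message above, using Python 3 module names.
    """
    if isinstance(obj, email.message.Message):
        # Already parsed: hand it back without a round trip through a string.
        return obj
    if hasattr(obj, "readline"):
        # File-like object: let the email package read from it directly.
        return email.message_from_file(obj)
    # Anything else is treated as the raw message text.
    return email.message_from_string(obj)


raw = "Subject: hi\n\nbody\n"
parsed = email.message_from_string(raw)

same = to_message(parsed)                  # Message in, same Message out
from_file = to_message(io.StringIO(raw))   # file-like input
from_text = to_message(raw)                # plain string input
```

The isinstance check comes first so that a caller which already holds a
Message object, as hammie.py does, skips parsing entirely.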
Index: timtoken.py
===================================================================
RCS file: /cvsroot/spambayes/spambayes/timtoken.py,v
retrieving revision 1.6
retrieving revision 1.7
diff -C2 -d -r1.6 -r1.7
*** timtoken.py 7 Sep 2002 01:41:28 -0000 1.6
--- timtoken.py 7 Sep 2002 04:50:10 -0000 1.7
***************
*** 2,6 ****

  import email
- from email import message_from_string
  from sets import Set

--- 2,5 ----
***************
*** 555,566 ****
          yield 'content-transfer-encoding:' + x.lower()

! def tokenize(string):
      # Create an email Message object.
!     try:
!         msg = message_from_string(string)
!     except email.Errors.MessageParseError:
!         yield 'control: MessageParseError'
!         # XXX Fall back to the raw body text?
!         return

      # Special tagging of header lines.
--- 554,570 ----
          yield 'content-transfer-encoding:' + x.lower()

! def tokenize(obj):
      # Create an email Message object.
!     if isinstance(obj, email.Message.Message):
!         msg = obj
!     elif hasattr(obj, "readline"):
!         msg = email.message_from_file(obj)
!     else:
!         try:
!             msg = email.message_from_string(obj)
!         except email.Errors.MessageParseError:
!             yield 'control: MessageParseError'
!             # XXX Fall back to the raw body text?
!             return

      # Special tagging of header lines.