[Spambayes-checkins] spambayes tokenizer.py,1.46,1.47

Tim Peters tim_one@users.sourceforge.net
Mon, 21 Oct 2002 18:37:55 -0700


Update of /cvsroot/spambayes/spambayes
In directory usw-pr-cvs1:/tmp/cvs-serv11217

Modified Files:
	tokenizer.py 
Log Message:
Replace &nbsp; with a blank.  &amp; doesn't appear to show up often enough
to bother with.


Index: tokenizer.py
===================================================================
RCS file: /cvsroot/spambayes/spambayes/tokenizer.py,v
retrieving revision 1.46
retrieving revision 1.47
diff -C2 -d -r1.46 -r1.47
*** tokenizer.py	30 Sep 2002 21:56:27 -0000	1.46
--- tokenizer.py	22 Oct 2002 01:37:53 -0000	1.47
***************
*** 1083,1089 ****
                  yield t
  
!             # Remove HTML/XML tags.
              if (part.get_content_type() == "text/plain" or
                      not options.retain_pure_html_tags):
                  text = html_re.sub(' ', text)
  
--- 1083,1090 ----
                  yield t
  
!             # Remove HTML/XML tags.  Also &nbsp;.
              if (part.get_content_type() == "text/plain" or
                      not options.retain_pure_html_tags):
+                 text = text.replace('&nbsp;', ' ')
                  text = html_re.sub(' ', text)
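
A minimal standalone sketch of what this change does to the token stream.
The tag_re pattern and the strip_markup helper are hypothetical stand-ins
for illustration only; the tokenizer's real html_re is not shown in this
diff and is likely more involved.

```python
import re

# Hypothetical, simplified stand-in for the tokenizer's html_re.
tag_re = re.compile(r"<[^>]*>")

def strip_markup(text):
    # Turn the &nbsp; entity into an ordinary blank first ...
    text = text.replace('&nbsp;', ' ')
    # ... then replace each tag with a blank, mirroring the patched code.
    return tag_re.sub(' ', text)

sample = "Buy&nbsp;now<br>cheap&nbsp;meds"
print(strip_markup(sample).split())
```

Without the replace() call, whitespace splitting would leave "Buy&nbsp;now"
and "cheap&nbsp;meds" glued together as single tokens.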