[Spambayes-checkins] spambayes/Outlook2000 train.py,1.12,1.13

Mark Hammond mhammond@users.sourceforge.net
Mon Nov 4 01:12:56 2002


Update of /cvsroot/spambayes/spambayes/Outlook2000
In directory usw-pr-cvs1:/tmp/cvs-serv2046

Modified Files:
	train.py 
Log Message:
Fix the root cause of my exception at:
  File "F:\src\spambayes\classifier.py", line 450, in _getclues
    distance = abs(prob - 0.5)

The problem is that we trained but didn't update probabilities, so
scoring failed for every new word seen only since the last complete
retrain.

There may be a case for _getclues() to detect a probability of None
and call update_probabilities() automatically - either that or just
keep throwing vague exceptions <wink>
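
As a rough illustration of the failure and of the guard suggested above,
here is a minimal sketch. ToyClassifier, its distance() method and the
auto_update flag are hypothetical stand-ins and not the real classifier.py
code; the sketch only assumes that a word learned without a probability
update carries a probability of None until update_probabilities() runs.

```python
class ToyClassifier:
    """Hypothetical stand-in; NOT the spambayes classifier."""

    def __init__(self):
        self.spam_counts = {}   # word -> number of spam messages containing it
        self.ham_counts = {}    # word -> number of ham messages containing it
        self.probs = {}         # word -> spam probability, or None if never computed

    def learn(self, words, is_spam, update_probs=True):
        counts = self.spam_counts if is_spam else self.ham_counts
        for word in set(words):
            counts[word] = counts.get(word, 0) + 1
            # A brand-new word has no probability until the next update.
            self.probs.setdefault(word, None)
        if update_probs:
            self.update_probabilities()

    def update_probabilities(self):
        # Toy estimate: fraction of this word's sightings that were spam.
        for word in self.probs:
            spam = self.spam_counts.get(word, 0)
            ham = self.ham_counts.get(word, 0)
            self.probs[word] = spam / float(spam + ham)

    def distance(self, word, auto_update=False):
        prob = self.probs[word]
        if prob is None and auto_update:
            # The guard floated in the log message: recompute on demand.
            self.update_probabilities()
            prob = self.probs[word]
        return abs(prob - 0.5)   # TypeError if prob is still None


clf = ToyClassifier()
clf.learn(["newword"], True, update_probs=False)

try:
    clf.distance("newword")
    failed = False
except TypeError:
    failed = True               # the vague exception from the traceback

guarded = clf.distance("newword", auto_update=True)
```

Without the guard, the lookup fails on None; with it, the first lookup
pays for a full recomputation, which is the trade-off the suggestion
leaves open.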



Index: train.py
===================================================================
RCS file: /cvsroot/spambayes/spambayes/Outlook2000/train.py,v
retrieving revision 1.12
retrieving revision 1.13
diff -C2 -d -r1.12 -r1.13
*** train.py	31 Oct 2002 22:03:35 -0000	1.12
--- train.py	4 Nov 2002 01:12:53 -0000	1.13
***************
*** 19,23 ****
      return spam == True
  
! def train_message(msg, is_spam, mgr):
      # Train an individual message.
      # Returns True if newly added (message will be correctly
--- 19,23 ----
      return spam == True
  
! def train_message(msg, is_spam, mgr, update_probs = True):
      # Train an individual message.
      # Returns True if newly added (message will be correctly
***************
*** 41,44 ****
--- 41,47 ----
      mgr.bayes.learn(tokens, is_spam, False)
      mgr.message_db[msg.searchkey] = is_spam
+     if update_probs:
+         mgr.bayes.update_probabilities()
+ 
      mgr.bayes_dirty = True
      return True
***************
*** 51,55 ****
          progress.tick()
          try:
!             if train_message(message, isspam, mgr):
                  num_added += 1
          except:
--- 54,58 ----
          progress.tick()
          try:
!             if train_message(message, isspam, mgr, False):
                  num_added += 1
          except:
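
For readers skimming the diff, the pattern it introduces can be sketched
as follows. This is a simplified illustration under stated assumptions:
CountingClassifier and train_folder are hypothetical names, messages are
reduced to token lists, and the single update_probabilities() call after
the loop is an assumption about the caller, since that part of train.py
is not shown in the diff.

```python
class CountingClassifier:
    """Hypothetical stand-in that only counts probability recomputations."""

    def __init__(self):
        self.learned = 0
        self.updates = 0

    def learn(self, tokens, is_spam, update_probs=True):
        self.learned += 1
        if update_probs:
            self.update_probabilities()

    def update_probabilities(self):
        self.updates += 1


def train_message(tokens, is_spam, bayes, update_probs=True):
    # Mirrors the patched function: learn without updating, then
    # update once unless the caller asked us not to.
    bayes.learn(tokens, is_spam, False)
    if update_probs:
        bayes.update_probabilities()
    return True


def train_folder(messages, is_spam, bayes):
    # Batch path: suppress the per-message update inside the loop.
    num_added = 0
    for tokens in messages:
        if train_message(tokens, is_spam, bayes, False):
            num_added += 1
    # Assumption: the batch caller recomputes once when it is done.
    bayes.update_probabilities()
    return num_added


bayes = CountingClassifier()
added = train_folder([["a"], ["b"], ["c"]], True, bayes)
updates_after_batch = bayes.updates

# A single interactive training call uses the default and updates at once.
train_message(["d"], False, bayes)
updates_after_single = bayes.updates
```

The default of update_probs = True keeps one-off training safe, while the
batch loop opts out so the recomputation is not repeated per message.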