[Python-checkins] python/nondist/sandbox/spambayes classifier.py,1.10,1.11
tim_one@users.sourceforge.net
Wed, 04 Sep 2002 22:37:14 -0700
Update of /cvsroot/python/python/nondist/sandbox/spambayes
In directory usw-pr-cvs1:/tmp/cvs-serv13543
Modified Files:
classifier.py
Log Message:
Added note about MINCOUNT oddities.
Index: classifier.py
===================================================================
RCS file: /cvsroot/python/python/nondist/sandbox/spambayes/classifier.py,v
retrieving revision 1.10
retrieving revision 1.11
diff -C2 -d -r1.10 -r1.11
*** classifier.py 5 Sep 2002 01:51:18 -0000 1.10
--- classifier.py 5 Sep 2002 05:37:12 -0000 1.11
***************
*** 57,60 ****
--- 57,70 ----
# (In addition, the count compared is after multiplying it with the
# appropriate bias factor.)
+ #
+ # XXX Reducing this to 1.0 (effectively not using it at all then) seemed to
+ # XXX give a sharp reduction in the f-n rate in a partial test run, while
+ # XXX adding a few mysterious f-ps. Then boosting it to 2.0 appeared to
+ # XXX give an increase in the f-n rate in a partial test run. This needs
+ # XXX deeper investigation. Might also be good to develop a more general
+ # XXX concept of confidence: MINCOUNT is a gross gimmick in that direction,
+ # XXX effectively saying we have no confidence in probabilities computed
+ # XXX from fewer than MINCOUNT instances, but unbounded confidence in
+ # XXX probabilities computed from at least MINCOUNT instances.
MINCOUNT = 5.0
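
The all-or-nothing behaviour described in the XXX note can be illustrated with a minimal sketch. This is not the code in classifier.py: the function name, the default bias factors (2.0 for ham, 1.0 for spam), the neutral fallback of 0.5, and the ratio formula are assumptions chosen for illustration. Only MINCOUNT = 5.0 and the rule that counts are compared after multiplying by the bias factor come from the diff above.

```python
def token_spamprob(ham_count, spam_count, nham, nspam,
                   mincount=5.0, ham_bias=2.0, spam_bias=1.0,
                   unknown_prob=0.5):
    """Hypothetical MINCOUNT-style cutoff; all defaults except mincount are assumed."""
    # The count compared against mincount is taken after applying the
    # bias factors, as the comment in the diff describes.
    biased_ham = ham_bias * ham_count
    biased_spam = spam_bias * spam_count

    # Below the threshold: no confidence at all, so fall back to a
    # neutral probability (assumed here to be 0.5).
    if biased_ham + biased_spam < mincount:
        return unknown_prob

    # At or above the threshold: the computed probability is trusted fully,
    # however few messages it is based on.
    ham_ratio = min(1.0, biased_ham / nham)
    spam_ratio = min(1.0, biased_spam / nspam)
    return spam_ratio / (ham_ratio + spam_ratio)


# One extra spam sighting flips a token from "unknown" to "certain spam".
below = token_spamprob(0, 4, nham=100, nspam=100)
at_threshold = token_spamprob(0, 5, nham=100, nspam=100)
# With mincount reduced to 1.0, a single sighting is already trusted.
no_cutoff = token_spamprob(0, 1, nham=100, nspam=100, mincount=1.0)
```

In this sketch a token seen four times only in spam scores as unknown, while one seen five times scores as certain spam. That jump is the step from no confidence to unbounded confidence that the note calls a gross gimmick.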