[Spambayes-checkins] spambayes runtest.sh,NONE,1.1 README.txt,1.18,1.19

Neale Pickett rubiconx@users.sourceforge.net
Mon, 16 Sep 2002 21:49:18 -0700


Update of /cvsroot/spambayes/spambayes
In directory usw-pr-cvs1:/tmp/cvs-serv26378

Modified Files:
	README.txt 
Added Files:
	runtest.sh 
Log Message:
Added the runtest.sh script, which is supposed to make it easier for
rubes like myself to submit useful test results.


--- NEW FILE: runtest.sh ---
#! /bin/sh -x
##
## runtest.sh -- run some tests for Tim
##
## This does everything you need to test yer data.  You may want to skip
## the rebal steps if you've recently moved some of your messages
## (because they were in the wrong corpus) or you may suffer my fate and
## get stuck forever re-categorizing email.
##
## Just set up your messages as detailed in README.txt; put them all in
## the reservoir directories, and this script will take care of the
## rest.  Paste the output (also in results.txt) to the mailing list for
## good karma.
##
## Neale Pickett <neale@woozle.org>
##

# Number of messages per rebalanced set
RNUM=200

# Number of sets
SETS=5

# Move every message back into the reservoirs (-n 0 leaves the sets empty)
python rebal.py -r Data/Ham/reservoir -s Data/Ham/Set -n 0 -Q
python rebal.py -r Data/Spam/reservoir -s Data/Spam/Set -n 0 -Q
# Rebalance: refill each set with $RNUM messages from the reservoirs
python rebal.py -r Data/Ham/reservoir -s Data/Ham/Set -n $RNUM -Q
python rebal.py -r Data/Spam/reservoir -s Data/Spam/Set -n $RNUM -Q
# Clear out .ini file
rm -f bayescustomize.ini
# Run 1
python timcv.py -n $SETS > run1.txt
# New .ini file
cat > bayescustomize.ini <<EOF
[Classifier]
adjust_probs_by_evidence_mass: True
min_spamprob: 0.001
max_spamprob: 0.999
hambias: 1.5
EOF
# Run 2
python timcv.py -n $SETS > run2.txt
# Generate rate summaries (rates.py reads run1.txt and run2.txt and
# writes run1s.txt and run2s.txt)
python rates.py run1 run2 > runrates.txt
# Compare the two summaries; tee also saves the output to results.txt
python cmp.py run1s run2s | tee results.txt
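
The script above assumes the reservoir directories already exist and hold
messages. As a rough sketch only (check_reservoirs is a hypothetical helper,
not part of runtest.sh or of spambayes), a pre-flight check along these lines
could stop the run early when a reservoir is missing or empty. It assumes the
Data/Ham/reservoir and Data/Spam/reservoir layout used by the commands above.

```shell
#! /bin/sh
# Hypothetical pre-flight check; not part of the committed runtest.sh.
# Verifies that both reservoir directories exist and are non-empty.

check_reservoirs() {
    # $1 is the data root; defaults to "Data", the root runtest.sh uses
    root=${1:-Data}
    for corpus in Ham Spam; do
        dir="$root/$corpus/reservoir"
        if [ ! -d "$dir" ]; then
            echo "missing directory: $dir" >&2
            return 1
        fi
        # ls -A prints nothing when the directory is empty
        if [ -z "$(ls -A "$dir")" ]; then
            echo "empty reservoir: $dir" >&2
            return 1
        fi
    done
    return 0
}
```

One place to call it would be right after the two "-n 0" rebal.py lines, since
that is the point where all messages should be sitting in the reservoirs:
check_reservoirs Data || exit 1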

Index: README.txt
===================================================================
RCS file: /cvsroot/spambayes/spambayes/README.txt,v
retrieving revision 1.18
retrieving revision 1.19
diff -C2 -d -r1.18 -r1.19
*** README.txt	14 Sep 2002 22:18:24 -0000	1.18
--- README.txt	17 Sep 2002 04:49:16 -0000	1.19
***************
*** 124,127 ****
--- 124,134 ----
      the number of messages per folder.
  
+ runtest.sh
+     A Bourne shell script (for Unix) which will run some test or other.
+     I (Neale) will try to keep this updated to test whatever Tim is
+     currently asking for.  The idea is, if you have a standard directory
+     structure (below), you can run this thing, go have some tea while it
+     works, then paste the output to the spambayes list for good karma.
+ 
  
  Standard Test Data Setup