[Spambayes-checkins] spambayes runtest.sh,NONE,1.1 README.txt,1.18,1.19
Neale Pickett
rubiconx@users.sourceforge.net
Mon, 16 Sep 2002 21:49:18 -0700
Update of /cvsroot/spambayes/spambayes
In directory usw-pr-cvs1:/tmp/cvs-serv26378
Modified Files:
README.txt
Added Files:
runtest.sh
Log Message:
Added the runtest.sh script, which is supposed to make it easier for
rubes like myself to submit useful test results.
--- NEW FILE: runtest.sh ---
#! /bin/sh -x
##
## runtest.sh -- run some tests for Tim
##
## This does everything you need to test yer data. You may want to skip
## the rebal steps if you've recently moved some of your messages
## (because they were in the wrong corpus) or you may suffer my fate and
## get stuck forever re-categorizing email.
##
## Just set up your messages as detailed in README.txt; put them all in
## the reservoir directories, and this script will take care of the
## rest. Paste the output (also in results.txt) to the mailing list for
## good karma.
##
## Neale Pickett <neale@woozle.org>
##
# Number of messages per rebalanced set
RNUM=200
# Number of sets
SETS=5
# Put them all back into the reservoirs (-n 0 leaves nothing in the sets)
python rebal.py -r Data/Ham/reservoir -s Data/Ham/Set -n 0 -Q
python rebal.py -r Data/Spam/reservoir -s Data/Spam/Set -n 0 -Q
# Rebalance: $RNUM messages per set
python rebal.py -r Data/Ham/reservoir -s Data/Ham/Set -n "$RNUM" -Q
python rebal.py -r Data/Spam/reservoir -s Data/Spam/Set -n "$RNUM" -Q
# Clear out .ini file
rm -f bayescustomize.ini
# Run 1: default settings (no bayescustomize.ini)
python timcv.py -n "$SETS" > run1.txt
# New .ini file
cat > bayescustomize.ini <<EOF
[Classifier]
adjust_probs_by_evidence_mass: True
min_spamprob: 0.001
max_spamprob: 0.999
hambias: 1.5
EOF
# Run 2: with the customized classifier settings above
python timcv.py -n "$SETS" > run2.txt
# Generate rates (summaries go to run1s.txt and run2s.txt)
python rates.py run1 run2 > runrates.txt
# Compare rates
python cmp.py run1s run2s | tee results.txt
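(Not part of the checked-in file.) The script above assumes the reservoir
directories already exist under Data/. A minimal sketch of a pre-flight check
that could run before the first rebal.py call is shown here. The function names
check_dir and preflight are hypothetical, and the directory names are taken
only from the rebal.py lines above:

```sh
#!/bin/sh
# Hypothetical helpers; not part of runtest.sh as checked in.

# check_dir DIR: succeed only if DIR exists and is a directory.
check_dir() {
    if [ ! -d "$1" ]; then
        echo "missing directory: $1" >&2
        return 1
    fi
    return 0
}

# preflight [BASE]: check both reservoirs under BASE (default: Data),
# matching the paths passed to rebal.py with -r.
preflight() {
    base=${1:-Data}
    check_dir "$base/Ham/reservoir" && check_dir "$base/Spam/reservoir"
}
```

Calling "preflight || exit 1" near the top of the script would stop the run
before any messages are moved if either reservoir is missing.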
Index: README.txt
===================================================================
RCS file: /cvsroot/spambayes/spambayes/README.txt,v
retrieving revision 1.18
retrieving revision 1.19
diff -C2 -d -r1.18 -r1.19
*** README.txt 14 Sep 2002 22:18:24 -0000 1.18
--- README.txt 17 Sep 2002 04:49:16 -0000 1.19
***************
*** 124,127 ****
--- 124,134 ----
the number of messages per folder.
+ runtest.sh
+ A Bourne shell script (for Unix) which will run some test or other.
+ I (Neale) will try to keep this updated to test whatever Tim is
+ currently asking for. The idea is, if you have a standard directory
+ structure (below), you can run this thing, go have some tea while it
+ works, then paste the output to the spambayes list for good karma.
+
Standard Test Data Setup