[spambayes-dev] University Project

T. Alexander Popiel popiel at wolfskeep.com
Wed Apr 5 22:03:15 CEST 2006

In message:  <BAY108-F395318AE6453102EE206418ECB0 at phx.gbl>
             "Michael Harris" <m1keharris at hotmail.com> writes:
>I am trying to write a program for my third-year project at university which
>will attempt to find the best configuration of spambayes for a user
>given his spam/ham corpus.
>I was just wondering if anyone could tell me, firstly, how I can train
>spambayes from the command line on mboxes. And secondly, I need a way of
>querying spambayes for the 'spamminess' probability it would assign to an
>individual email, given the training it has received.

For command-line operations of spambayes, look at sb_filter; it
provides the ability to train, untrain, and evaluate messages.

However, for large-scale evaluation of training strategies, I suggest
reading through TESTING.txt and testtools/{timcv.py,table.py,*.txt}.
Doing it all through the command line will be much, much slower than
using one of the test scripts that embeds a classifier and iterates
over all the messages for training and testing (without paying the
startup costs for each operation).
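
To make the "embed a classifier and iterate" idea concrete, here is a
minimal sketch of the shape such a script takes. It is not timcv.py and
does not use the spambayes classifier: ToyClassifier, mbox_texts and
train_and_score are hypothetical names, and the scoring is a crude
naive-Bayes-style stand-in chosen only so the example runs with nothing
but the Python standard library. The point is that a single process
reads the mboxes, trains once, and then scores every test message.

```python
import mailbox
import math
import re
from collections import Counter

TOKEN_RE = re.compile(r"[a-z0-9$'-]+")


def tokenize(text):
    """Return the set of lowercase word-like tokens in a message."""
    return set(TOKEN_RE.findall(text.lower()))


class ToyClassifier:
    """Hypothetical stand-in for an embedded classifier (NOT spambayes)."""

    def __init__(self):
        self.spam_counts = Counter()
        self.ham_counts = Counter()
        self.nspam = 0
        self.nham = 0

    def learn(self, text, is_spam):
        """Count each distinct token once for the given class."""
        if is_spam:
            self.spam_counts.update(tokenize(text))
            self.nspam += 1
        else:
            self.ham_counts.update(tokenize(text))
            self.nham += 1

    def spamprob(self, text):
        """Return a score in [0, 1]; higher means more spam-like."""
        log_odds = math.log((self.nspam + 1) / (self.nham + 1))
        for tok in tokenize(text):
            # Add-one smoothing so unseen tokens never give a zero.
            p_spam = (self.spam_counts[tok] + 1) / (self.nspam + 2)
            p_ham = (self.ham_counts[tok] + 1) / (self.nham + 2)
            log_odds += math.log(p_spam / p_ham)
        # Clamp so math.exp cannot overflow on long messages.
        log_odds = max(-50.0, min(50.0, log_odds))
        return 1.0 / (1.0 + math.exp(-log_odds))


def mbox_texts(path):
    """Yield every message in an mbox file as one string."""
    box = mailbox.mbox(path, create=False)
    try:
        for key in box.iterkeys():
            # latin-1 maps every byte, so decoding cannot fail.
            yield box.get_bytes(key).decode("latin-1")
    finally:
        box.close()


def train_and_score(ham_path, spam_path, test_path):
    """Train once in-process, then score each message in test_path."""
    clf = ToyClassifier()
    for text in mbox_texts(ham_path):
        clf.learn(text, is_spam=False)
    for text in mbox_texts(spam_path):
        clf.learn(text, is_spam=True)
    return [clf.spamprob(text) for text in mbox_texts(test_path)]
```

Calling train_and_score("ham.mbox", "spam.mbox", "test.mbox") (file names
assumed) returns one score per test message. A configuration search would
wrap that call in a loop over candidate settings, so the interpreter and
the training data are loaded once rather than once per message.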

- Alex

