[Spambayes] how spambayes handles image-only spams

Meyer, Tony T.A.Meyer at massey.ac.nz
Thu Sep 11 22:55:55 EDT 2003


> What's your test protocol?  I did "shuffle messages randomly, 
> but preserve knowledge of which class they were in, then 
> train with the first 90% and then test with the last 10%".  
> Repeat as needed...

I did "rebal.py -n5", which IIRC is roughly equivalent to "shuffle
messages randomly, but preserve knowledge of which class they were in".
I then did "timtest.py -n5".
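
For concreteness, the quoted 90%/10% protocol could be sketched roughly
as below.  This is not code from the spambayes test suite, and it is not
what rebal.py does internally; shuffle_and_split, its parameters and the
stand-in message strings are all made up for illustration.

```python
import random

def shuffle_and_split(ham, spam, train_fraction=0.9, seed=0):
    """Shuffle labelled messages, then split them into train and test parts.

    Hypothetical helper: the name and arguments are assumptions, not
    anything provided by spambayes.
    """
    # Attach the class label to each message so it survives the shuffle.
    labelled = [(msg, "ham") for msg in ham] + [(msg, "spam") for msg in spam]
    # A seeded generator makes the shuffle repeatable between runs.
    random.Random(seed).shuffle(labelled)
    cut = int(len(labelled) * train_fraction)
    return labelled[:cut], labelled[cut:]

# Stand-in corpus: 10 ham and 10 spam "messages".
ham = ["h%d" % i for i in range(10)]
spam = ["s%d" % i for i in range(10)]
train, test = shuffle_and_split(ham, spam)
print(len(train), "training messages,", len(test), "test messages")
```

Changing the seed and calling it again would give the "repeat as needed"
part.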

I'm happy to admit I understand little of what the testing code does,
just how to interpret (most of) the results that it gives me.  This is
one of the strengths of the spambayes testing suite, IMO (not that I
have tried any other testing suites).

The readme says that it does this:
"""
Runs an NxN test grid, skipping the diagonal:
    N classifiers are built.
    N-1 runs are done with each classifier.
    Each classifier is trained on 1 set, and predicts against each of
        the N-1 remaining sets (those not used to train the classifier).
"""

So in my case (N=5), I think this means that I train on one 20% set,
then test against each of the four remaining 20% sets, and repeat with
each set taking its turn as the training set.  I may be wrong <wink>.
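
As I read the readme, the grid for N=5 could be enumerated along these
lines.  This is only a sketch of my understanding, not the actual
timtest.py code; grid_runs is a made-up name and the "SetN" labels are
just for illustration.

```python
def grid_runs(n):
    """Return (train_set, test_set) pairs for an n x n grid, minus the diagonal."""
    return [(train, test)
            for train in range(1, n + 1)
            for test in range(1, n + 1)
            if train != test]  # skip the diagonal: never test on the training set

runs = grid_runs(5)
for train, test in runs:
    print("train on Set%d, test on Set%d" % (train, test))
```

With N=5 that gives 5 classifiers with 4 test runs each, so 20 runs in
total, and each classifier only ever sees 20% of the data in training.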

=Tony Meyer


