[Spambayes] Need testers!

Tim Peters tim.one@comcast.net
Mon, 16 Sep 2002 22:33:10 -0400


[Neale Pickett]
> Okay, timcv on a well-balanced set of spam n' ham produces numbers which
> to me look more reasonable.  Since my set consists exclusively of mail
> I've received, my dataset is a bit small.

Excellent report, Neale!  A later msg said you found a number of
misclassified msgs.  For purposes of deciding whether to make

> [Classifier]
> adjust_probs_by_evidence_mass: True
> min_spamprob: 0.001
> max_spamprob: 0.999
> hambias: 1.5

the default, it would help if you ran the whole test again after fixing
those.  These settings were a pure and significant win for both error rates
in this test report, but that may change after msgs are shuffled between the
spam and ham sets.
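
For readers unfamiliar with these options, here is a rough sketch of how
hambias, min_spamprob and max_spamprob could enter a Graham-style per-word
spam probability.  It is an illustration under stated assumptions, not the
project's actual classifier code: the function name, its arguments, and the
neutral 0.5 fallback are hypothetical, and adjust_probs_by_evidence_mass is
not modeled at all.

```python
def word_spamprob(spamcount, hamcount, nspam, nham,
                  hambias=1.5, min_spamprob=0.001, max_spamprob=0.999):
    """Graham-style spam probability for one word (illustrative only).

    spamcount/hamcount: number of spam/ham msgs containing the word.
    nspam/nham: total number of spam/ham msgs trained on.
    """
    # Fraction of ham and spam msgs containing the word.  Ham counts are
    # inflated by hambias, and each ratio is capped at 1.0.
    hamratio = min(1.0, hamcount * hambias / nham)
    spamratio = min(1.0, spamcount / nspam)
    if hamratio + spamratio == 0.0:
        # Assumption: a word never seen in training is treated as neutral.
        return 0.5
    prob = spamratio / (hamratio + spamratio)
    # Clamp so that no single word can be treated as absolute proof.
    return max(min_spamprob, min(max_spamprob, prob))
```

With these values, a word seen only in spam is capped at 0.999 and a word
seen only in ham is floored at 0.001, while hambias=1.5 makes each ham
occurrence count one and a half times.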

> If I did this all right, I think I'd like to check in a script to chain
> all these actions together, so knuckleheads like me don't go running
> useless tests again.

I agree that would be helpful (and, yes, it was all impeccably right).  I
have a collection of Windows .bat files to chain things together, but didn't
want to insult you Unixoids by checking them in <wink>.
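
A portable alternative to .bat files might look something like the sketch
below: a small Python driver that runs a list of commands in order and stops
at the first failure.  The run_chain name and the STEPS list are hypothetical
placeholders; the real steps and their arguments would have to be filled in
from the actual test procedure.

```python
import subprocess
import sys


def run_chain(steps):
    """Run each command in order; return the first nonzero exit code, else 0."""
    for i, cmd in enumerate(steps, 1):
        print("step %d: %s" % (i, " ".join(cmd)))
        result = subprocess.run(cmd)
        if result.returncode != 0:
            print("step %d failed with exit code %d" % (i, result.returncode))
            return result.returncode
    return 0


# Placeholder steps only; substitute the real commands of the test run.
STEPS = [
    [sys.executable, "-c", "print('prepare the data')"],
    [sys.executable, "-c", "print('run the test')"],
]
```

Calling run_chain(STEPS) from a script's entry point and passing the result
to sys.exit() would let a calling shell see whether the whole chain succeeded.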