[spambayes-dev] Any updated information for the plans to make better statistics?

Erik Brown kirebrow at yahoo.com
Sun Oct 17 22:24:58 CEST 2004


> I'm not sure about adding an "accuracy percentage", since it's hard to
> define (what do you do with unsures?), and so might mislead people, since
> there presumably wouldn't be room to explain it in the dialog.  You're
> welcome to try and convince me otherwise, of course; it would certainly be
> an easy addition.

I originally wanted an accuracy percentage that was something like
POPFile's.  It may just be me, but seeing the percentage number (99.1)
increase (or decrease) over time is just cool. = )

I think the way POPFile's stats currently work is that you have to
reclassify "unclassified" mail before the stats take a hit.  If you see any
value in including this number, every manually reclassified email would
count as an error (in any folder: ham, spam, unsure).  This way, if you get
a message in the unsure folder that you would not want to train on because
it might taint the corpus, you can leave it alone and the accuracy
percentage won't take a hit in these rare cases.
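To make the counting concrete, here is a rough sketch of that first idea. The function name and both counters are made up for illustration and are not existing SpamBayes or POPFile code; it assumes the only inputs are the total number of messages seen and the number the user moved by hand.

```python
def accuracy_by_reclassification(total_messages, reclassified):
    """Percentage of messages the user never had to reclassify by hand.

    Hypothetical helper: any manual reclassification (from ham, spam,
    or unsure) counts as one error; untouched messages count as correct.
    """
    if total_messages <= 0:
        raise ValueError("total_messages must be positive")
    if not 0 <= reclassified <= total_messages:
        raise ValueError("reclassified must be between 0 and total_messages")
    return 100.0 * (total_messages - reclassified) / total_messages


# 9 manual reclassifications out of 1000 messages
print(round(accuracy_by_reclassification(1000, 9), 1))
```

An unsure message that is left alone never touches the `reclassified` counter, so it does not lower the number.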

However, if this sounds like a bad idea, I can always subtract the unsure
percentage from 100, then somehow figure in the false positives and false
negatives, if any.  By the way, would they be counted as unsure after you
reclassify them?
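A rough sketch of this fallback calculation, assuming the answer to that question is "no", so unsures, false positives, and false negatives are three separate counts with no overlap. The function and parameter names are invented for illustration and are not part of SpamBayes.

```python
def accuracy_from_counts(total, unsure, false_pos, false_neg):
    """100 minus the unsure, false-positive, and false-negative percentages.

    Hypothetical helper: assumes the three counts do not overlap, i.e. a
    reclassified false positive or false negative is not also an unsure.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if min(unsure, false_pos, false_neg) < 0:
        raise ValueError("counts must be non-negative")
    misses = unsure + false_pos + false_neg
    if misses > total:
        raise ValueError("counts add up to more than total")
    return 100.0 * (total - misses) / total


# 1000 messages: 40 unsure, 2 false positives, 8 false negatives
print(accuracy_from_counts(1000, 40, 2, 8))
```

If reclassified messages did end up counted as unsure as well, the sum above would count them twice and the result would come out too low.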

Erik Brown



