[spambayes-dev] Any updated information for the plans to make better
statistics?
Erik Brown
kirebrow at yahoo.com
Sun Oct 17 22:24:58 CEST 2004
> I'm not sure about adding an "accuracy percentage", since it's hard to
> define (what do you do with unsures?), and so might mislead people, since
> there presumably wouldn't be room to explain it in the dialog. You're
> welcome to try and convince me otherwise, of course; it would certainly be
> an easy addition.
I originally wanted an accuracy percentage that was something like
POPFile's. It may just be me, but seeing the percentage number (99.1)
increase (or decrease) over time is just cool. = )
I think the way POPFile's stats currently work is that you have to
reclassify "unclassified" mail before the stats take a hit. If you see any
value in including this number, every manually reclassified email would
count as an error (in any folder: ham, spam, unsure). That way, if a
message lands in the unsure folder and you would rather not train on it
because it might taint the corpus, the accuracy percentage won't take a
hit on those rare occasions.
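A minimal sketch of that rule, assuming the two counts are available from
somewhere. The function and parameter names are hypothetical and are not
taken from the SpamBayes code:

```python
def reclassification_accuracy(total_messages, manually_reclassified):
    """Percentage of messages the user never had to move by hand.

    Hypothetical helper: only a manual reclassification counts as an
    error, whichever folder (ham, spam, unsure) the message was in.
    An unsure message that is left alone does not lower the figure.
    """
    if total_messages <= 0:
        return None  # nothing classified yet, so no percentage to show
    if not 0 <= manually_reclassified <= total_messages:
        raise ValueError("reclassified count must be between 0 and total")
    correct = total_messages - manually_reclassified
    return 100.0 * correct / total_messages
```

With 1000 messages and 9 manual reclassifications this gives 99.1.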
However, if this sounds like a bad idea, I can always subtract the unsure
percentage from 100, then somehow figure in the false positives and false
negatives if any. Would they be counted as unsure after you re-classify
them btw?
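The subtractive version could look roughly like this. It assumes the
three counts do not overlap, i.e. a false positive or false negative is
not also counted as an unsure, which is exactly the open question above.
All names are hypothetical:

```python
def subtractive_accuracy(total, unsure, false_positives, false_negatives):
    """100 minus the unsure, false positive and false negative percentages.

    Hypothetical helper: assumes the three counts are disjoint. If
    reclassified messages were also counted as unsure, they would be
    subtracted twice here.
    """
    if total <= 0:
        return None  # nothing classified yet
    errors = unsure + false_positives + false_negatives
    if errors > total:
        raise ValueError("error counts exceed total; are they overlapping?")
    return 100.0 * (total - errors) / total
```

With 1000 messages, 40 unsures, 1 false positive and 9 false negatives
this gives 95.0.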
Erik Brown