RE: [Python-Dev] The first trustworthy <wink> GBayes results

3 Sep 2002


[Neil Schemenauer]
...
I noticed that as well.  When the classifier goes wrong it goes badly
wrong and using different thresholds would not help.  It seems that
increasing the number of discriminators doesn't really help either.  Too
bad because otherwise you could flag those messages for human
classification.

I think it's worse than just that:  suppose any scheme says "OK, this is
spam, with probability 0.9995".  If it's reporting accurate probabilities,
then another way to read that claim is "On average, one time in 2000 this
message actually isn't spam".  In real life we have to accept that there's
no scheme with a 0% false positive rate -- not even human review -- short of
the scheme that never calls anything spam.  Since deciding on the largest
acceptable false positive rate is far more a social than a technical issue,
a group of nerds will do anything rather than face it <wink>.
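
To make the arithmetic above concrete, here is a minimal sketch. It assumes the reported score really is a calibrated probability; the helper name expected_false_positives and the figure of 10000 flagged messages are made up for illustration and are not part of GBayes.

```python
from fractions import Fraction

def expected_false_positives(spam_probability, flagged_count):
    """Expected number of non-spam messages among flagged_count messages
    that were all scored at spam_probability (assumes calibrated scores)."""
    return (1 - Fraction(spam_probability)) * flagged_count

# A claimed probability of 0.9995 leaves 1 - 0.9995 for "not spam".
error_rate = 1 - Fraction("0.9995")
print(error_rate)                                 # 1/2000

# Hypothetical volume: 10000 messages flagged at that score.
print(expected_false_positives("0.9995", 10000))  # 5
```

Fraction is used instead of float so the one-in-2000 figure comes out exactly rather than as a rounded decimal.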


Tim Peters