[Spambayes] dumb question - why not simply the subject?

Tim Peters tim.one@comcast.net
Sat, 28 Sep 2002 13:55:09 -0400


[Skip Montanaro]
> If humans can differentiate ham and spam in the blink of an eye with
> near 100% accuracy looking at just the subject of a message, why
> should we need to feed any other content to a decent spam detector?

If it's true that humans can do this, we shouldn't need to.  What evidence
do you have for believing the antecedent, though?  I'll bet a dollar that
our classifier today would blow humans out of the water on both error rates
if such an experiment were to be conducted.

First we need to decide how much time we're going to give people for each
msg.

    http://www.improb.com/news/2000/aug2000/impe-twinkle-2000-08.html
    Hakkanen, Summala, et al. found a typical 8.23 millisecond blink
    duration in sleepy bus drivers, and a typical 5.19 millisecond
    blink duration in non-sleepy non-bus drivers.

Even at the slothful sleepy bus driver rate, we'll have to hold our human
subjects to classifying at least 120 messages per second <wink>.
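
For anyone who wants to check the arithmetic, here is a minimal sketch. It
assumes each message gets exactly one blink, and takes the two durations from
the quote above; the names BLINK_MS and msgs_per_second are made up for
illustration and aren't part of the spambayes code.

```python
# Blink durations in milliseconds, as given in the quoted passage.
BLINK_MS = {
    "sleepy bus driver": 8.23,
    "non-sleepy non-bus driver": 5.19,
}


def msgs_per_second(blink_ms):
    """Messages classified per second, allowing one blink per message."""
    return 1000.0 / blink_ms


if __name__ == "__main__":
    for who, ms in BLINK_MS.items():
        print(f"{who}: {msgs_per_second(ms):.1f} msgs/sec")
    # sleepy bus driver: 121.5 msgs/sec
    # non-sleepy non-bus driver: 192.7 msgs/sec
```

So "at least 120" is the generous figure; at the non-sleepy rate the bar rises
to roughly 190 messages per second.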