[spambayes-dev] Enhanced Outlook statistics display
Tony Meyer
tameyer at ihug.co.nz
Thu Dec 16 08:43:07 CET 2004
[Tim Peters]
> I'm sure we mentioned that paper here in the early days;
I'll believe you. It may have been prior to the separate mailing list (I
only read that far back), or maybe I just don't recall it - when I did read
the archives, I wasn't really that interested in reading stuff like that.
> and note that
> Gary Robinson's oft-noted site has referred to it too approximately
> forever, although via a different link:
>
> http://arxiv.org/abs/cs.CL/0006013
Yes, but *after* the maths, when everyone has stopped reading <wink>.
> I knew the paper, and the choice to model costs in SpamBayes testing
> in terms of hypothetical dollars charged to instances of different
> kinds of errors was deliberate, but there's really no connection
> between those. "Dollars and cents" models are simply intuitively
> appealing to people regardless of statistical background, and I didn't
> want the volunteer testers on this project to feel put out by a
> measure that seemed esoteric.
Since I've got you talking <wink>, what was the basis behind the
$10, $1, $0.20 choices? Just numbers that seemed right, or something more
concrete?
> If you're going to provide a single figure of merit, there are
> constraints pushing in this direction. The choice of a linear model
> is convenient and arguably a good first-order (literally)
> approximation to a realistic cost model.
I think the people asking for more statistics (in the GUI) are probably
after a single figure - something to wave in front of people that ask about
the accuracy. Straight lines are easier to draw, too, if we ever do provide
the requested graphical representation <0.5 wink>.
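For anyone following along, here is a rough sketch of what a linear "dollars" figure of merit looks like. It is not the SpamBayes test code. The function name and the counts are made up for illustration, and the weights assume my reading of the figures above: $10 per false positive, $1 per false negative, $0.20 per unsure.

```python
def total_cost(false_positives, false_negatives, unsures,
               fp_weight=10.00, fn_weight=1.00, unsure_weight=0.20):
    """Linear cost model: each kind of error is charged a fixed number
    of hypothetical dollars, and the charges are simply summed.

    The default weights are assumptions taken from the figures quoted
    in this thread, not values read out of the SpamBayes source.
    """
    return (fp_weight * false_positives
            + fn_weight * false_negatives
            + unsure_weight * unsures)


# Hypothetical counts from a made-up test run.
cost = total_cost(false_positives=2, false_negatives=15, unsures=40)
print("total cost: $%.2f" % cost)
```

Because the model is linear, doubling every error count doubles the cost, which is what makes it easy to quote as a single number (or plot as a straight line).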
[Tony Meyer]
>> (I wish I had found this when I was writing my CEAS paper earlier
>> in the year).
[Tim Peters]
> Then you should have asked <wink>.
Well, it didn't really make that much difference, but it would have saved me
trying to explain the idea. That's what I get for (co)writing a paper in an
area I have very little background in without the luxury of time to do
better background reading. If I do wade outside my 'proper' research area
again, I'll at least be better prepared (and maybe I will ask :).
=Tony.Meyer