[spambayes-dev] Enhanced Outlook statistics display

Tony Meyer tameyer at ihug.co.nz
Thu Dec 16 08:43:07 CET 2004


[Tim Peters]
> I'm sure we mentioned that paper here in the early days;

I'll believe you.  It may have been prior to the separate mailing list (I
only read that far back), or maybe I just don't recall it - when I did read
the archives, I wasn't really that interested in reading stuff like that.

> and note that
> Gary Robinson's oft-noted site has referred to it too approximately
> forever, although via a different link:
> 
>    http://arxiv.org/abs/cs.CL/0006013

Yes, but *after* the maths, when everyone has stopped reading <wink>.

> I knew the paper, and the choice to model costs in SpamBayes testing
> in terms of hypothetical dollars charged to instances of different
> kinds of errors was deliberate, but there's really no connection
> between those.  "Dollars and cents" models are simply intuitively
> appealing to people regardless of statistical background, and I didn't
> want the volunteer testers on this project to feel put out by a
> measure that seemed esoteric.

Since I've got you talking <wink>, what was the basis behind the
$10, $1, $0.20 choices?  Just numbers that seemed right, or something more
concrete?

> If you're going to provide a single figure of merit, there are
> constraints pushing in this direction.  The choice of a linear model
> is convenient and arguably a good first-order (literally)
> approximation to a realistic cost model.

I think the people asking for more statistics (in the GUI) are probably
after a single figure - something to wave in front of people who ask about
the accuracy.  Straight lines are easier to draw, too, if we ever do provide
the requested graphical representation <0.5 wink>.
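
For anyone wanting to see the linear "dollars and cents" model spelled out,
here is a minimal sketch.  It assumes the three figures apply to false
positives, false negatives and unsures respectively (that mapping, and the
names FP_COST, FN_COST, UNSURE_COST and total_cost, are illustrative
assumptions, not taken from the SpamBayes source):

```python
# Hypothetical per-error charges in dollars.  Which figure goes with
# which kind of error is an assumption made for illustration.
FP_COST = 10.00      # assumed: ham wrongly classified as spam
FN_COST = 1.00       # assumed: spam wrongly classified as ham
UNSURE_COST = 0.20   # assumed: message left in the unsure range


def total_cost(fp, fn, unsure):
    """Linear cost model: each error kind is charged a flat rate."""
    return fp * FP_COST + fn * FN_COST + unsure * UNSURE_COST


if __name__ == "__main__":
    # Example counts are made up, not real test results.
    print("$%.2f" % total_cost(fp=1, fn=2, unsure=5))
```

Because the model is linear, the single figure is just a weighted sum of the
error counts, so doubling every count doubles the cost.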

[Tony Meyer]
>> (I wish I had found this when I was writing my CEAS paper earlier
>> in the year).

[Tim Peters]
> Then you should have asked <wink>.

Well, it didn't really make that much difference, but it would have saved me
trying to explain the idea.  That's what I get for (co)writing a paper in an
area I have very little background in without the luxury of time to do
better background reading.  If I do wade outside my 'proper' research area
again, I'll at least be better prepared (and maybe I will ask :).

=Tony.Meyer


