[Spambayes] RE: Further Improvement 2

Tim Peters tim.one@comcast.net
Mon, 23 Sep 2002 13:52:15 -0400

[Skip Montanaro]
> We've been doing a lot of manual knob turning trying to find the best
> settings for various options.  It seems like we need an
> optimization tool of some sort to automate this process.  I know by
> suggesting it, I'm implicitly volunteering to create it, but I don't
> have the time to do this.

It's difficult, as the effects of various option settings are generally not
independent.  I've been using what amounts to a "steepest descent" strategy
by hand:  fiddle a whole bunch of parameters, one at a time, on a fixed and
small test corpus.  Whichever one of those gave the biggest improvement then
gets loving and exceedingly wall-clock-time consuming attention on my full
test suite.  Go back to square one.  On the second iteration of this, you
discover that the first thing you tuned is no longer optimally tuned after
you tune the second thing you found.  The effects of various options are
generally not independent of corpus size either.  It's fun <wink>!