[Spambayes] Better optimization loop

Rob W.W. Hooft rob@hooft.net
Wed Nov 20 12:17:01 2002


Tim Peters wrote:
> [Tim]

>>>Good observation!  That should help.  simplex isn't fast in the best of
>>>cases, and in this case ...

> [Rob Hooft]

>>Anyone that has a faster optimization algorithm lying around is welcome
>>to replace my Simplex code.

[Tim]

> Twasn't a criticism, just an observation about downhill Simplex, in anyone's
> implementation.  Multidimensional optimization is a darned hard problem, and
> this approach is at least pretty robust.

It wasn't anger, it was a genuine invitation.... ;-) I'm running these 
tests, and they are taking daaaayysss, so I really welcome anyone that 
has alternatives.

One alternative I thought of is to keep the wordcounts lying around, and 
to call the update function only once, just before scoring starts. But I'm 
not sure I would be the best person to try that (read: I'm sure someone 
else can do that 10x faster than I can).
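
A rough sketch of that idea, to make it concrete. The class and method 
names here are made up for illustration and are not the real Spambayes 
classifier interface, and the probability is a plain ratio of frequencies, 
not the formula Spambayes actually uses:

```python
from collections import Counter


class DeferredCounts:
    """Hypothetical classifier: training only touches counts."""

    def __init__(self):
        self.spam_counts = Counter()  # word -> number of spam messages containing it
        self.ham_counts = Counter()   # word -> number of ham messages containing it
        self.nspam = 0
        self.nham = 0
        self.probs = {}
        self.update_calls = 0         # only here to show how often we rebuild

    def learn(self, words, is_spam):
        # Cheap: bump the counts, do not recompute any probabilities.
        counts = self.spam_counts if is_spam else self.ham_counts
        counts.update(set(words))
        if is_spam:
            self.nspam += 1
        else:
            self.nham += 1

    def update_probabilities(self):
        # Expensive: one pass over every known word. Call once before scoring.
        self.update_calls += 1
        self.probs = {}
        for word in set(self.spam_counts) | set(self.ham_counts):
            spam_ratio = self.spam_counts[word] / max(self.nspam, 1)
            ham_ratio = self.ham_counts[word] / max(self.nham, 1)
            # Simplified stand-in for the real spamprob calculation.
            self.probs[word] = spam_ratio / (spam_ratio + ham_ratio)


clf = DeferredCounts()
clf.learn(["cheap", "pills"], is_spam=True)
clf.learn(["cheap", "hello"], is_spam=True)
clf.learn(["hello", "meeting"], is_spam=False)
clf.update_probabilities()  # one rebuild for the whole training batch
```

With this split, the cost of the full pass is paid once per training run 
instead of once per message.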

Another speedup I could use is a version of Bayes that calculates the 
spamprob from the counts on demand instead of recalculating it for all 
words every time. This pays off in all cases where the training batch is 
very small (~1 message).
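
Something along these lines, again only as a sketch. The names are 
hypothetical, the 0.5 for unknown words is an assumption, and the ratio 
below is a simplified stand-in for the real spamprob formula:

```python
class LazySpamProbs:
    """Hypothetical classifier that computes a word's spamprob only when asked."""

    def __init__(self):
        self.counts = {}   # word -> [spam message count, ham message count]
        self.nspam = 0
        self.nham = 0
        self._cache = {}   # word -> spamprob, valid until the next learn()
        self.computed = 0  # only here to show how much work was done

    def learn(self, words, is_spam):
        column = 0 if is_spam else 1
        for word in set(words):
            self.counts.setdefault(word, [0, 0])[column] += 1
        if is_spam:
            self.nspam += 1
        else:
            self.nham += 1
        # The message totals changed, so every cached value is stale.
        # Clearing is cheap compared with recomputing all words.
        self._cache.clear()

    def spamprob(self, word):
        if word in self._cache:
            return self._cache[word]
        record = self.counts.get(word)
        if record is None:
            return 0.5  # assumed neutral value for words never seen
        spam_ratio = record[0] / max(self.nspam, 1)
        ham_ratio = record[1] / max(self.nham, 1)
        prob = spam_ratio / (spam_ratio + ham_ratio)
        self._cache[word] = prob
        self.computed += 1
        return prob


lazy = LazySpamProbs()
lazy.learn(["cheap", "pills"], is_spam=True)
lazy.learn(["hello", "cheap"], is_spam=False)
pills_prob = lazy.spamprob("pills")
pills_again = lazy.spamprob("pills")  # served from the cache
```

Training on one message then costs only as much as the words in that 
message, and scoring only pays for the words it actually looks up.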

Rob

-- 
Rob W.W. Hooft  ||  rob@hooft.net  ||  http://www.hooft.net/people/rob/



