[Spambayes] Better optimization loop
Rob W.W. Hooft
rob@hooft.net
Wed Nov 20 12:17:01 2002
Tim Peters wrote:
> [Tim]
>>>Good observation! That should help. simplex isn't fast in the best of
>>>cases, and in this case ...
> [Rob Hooft]
>>Anyone that has a faster optimization algorithm lying around is welcome
>>to replace my Simplex code.
[Tim]
> Twasn't a criticism, just an observation about downhill Simplex, in anyone's
> implementation. Multidimensional optimization is a darned hard problem, and
> this approach is at least pretty robust.
It wasn't anger, it was a genuine invitation.... ;-) I'm running these
tests, and they are taking daaaayysss, so I really welcome anyone that
has alternatives.
One alternative I thought of is to keep the wordcounts lying around and
to call the update function only once before starting scoring. But I'm not
sure I would be the best person to try that (read: I'm sure someone else
can do that 10x faster than I can).
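
A minimal sketch of that idea, under one reading of it: the word counts do
not depend on the parameters being optimized, so they can be built once
outside the optimizer, and each objective evaluation then needs a single
probability update before scoring. Every name here (build_counts,
update_probabilities, make_objective, the score callback) is hypothetical
and not taken from the Spambayes code, and the smoothing formula is an
assumed Robinson-style estimate.

```python
def build_counts(messages):
    """Count, per word, how many spam and ham messages contain it.

    `messages` is an iterable of (words, is_spam) pairs (assumed layout).
    """
    counts = {}  # word -> [spamcount, hamcount]
    nspam = nham = 0
    for words, is_spam in messages:
        if is_spam:
            nspam += 1
        else:
            nham += 1
        for word in set(words):
            counts.setdefault(word, [0, 0])[0 if is_spam else 1] += 1
    return counts, nspam, nham


def update_probabilities(counts, nspam, nham, unknown_prob, unknown_strength):
    """Derive every word's spamprob from the stored counts (assumed formula)."""
    probs = {}
    for word, (spamcount, hamcount) in counts.items():
        spamratio = spamcount / nspam if nspam else 0.0
        hamratio = hamcount / nham if nham else 0.0
        raw = spamratio / (spamratio + hamratio)
        n = spamcount + hamcount
        probs[word] = (unknown_strength * unknown_prob + n * raw) / (unknown_strength + n)
    return probs


def make_objective(messages, score):
    """Return a function for the optimizer; counting happens only here, once."""
    counts, nspam, nham = build_counts(messages)

    def objective(params):
        unknown_prob, unknown_strength = params
        # One update per evaluation, then straight to scoring.
        probs = update_probabilities(counts, nspam, nham,
                                     unknown_prob, unknown_strength)
        return score(probs)  # caller-supplied, hypothetical scoring callback

    return objective
```

The optimizer would then call objective(params) repeatedly without ever
touching the raw training messages again.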
Another speedup I could use is a version of Bayes that calculates the
spamprob from the numbers on demand instead of recalculating it for all
words every time. This pays off in all cases where the training batch is
very small (~1 message).
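
A rough sketch of such an on-demand variant, assuming a Robinson-style
smoothed estimate with illustrative defaults (0.5 and 0.45). The class name
LazyWordStats and its methods are hypothetical and not the Spambayes API.
Training only touches the counts; a word's spamprob is computed the first
time it is asked for and cached until the next training step.

```python
class LazyWordStats:
    """Hypothetical sketch: store raw counts, derive spamprob only when asked."""

    def __init__(self, unknown_prob=0.5, unknown_strength=0.45):
        self.nspam = 0
        self.nham = 0
        self.counts = {}  # word -> [spamcount, hamcount]
        self.unknown_prob = unknown_prob
        self.unknown_strength = unknown_strength
        self._cache = {}  # word -> spamprob, valid until the next learn()

    def learn(self, words, is_spam):
        """Update counts for one message; no probabilities are recomputed."""
        if is_spam:
            self.nspam += 1
        else:
            self.nham += 1
        index = 0 if is_spam else 1
        for word in set(words):
            self.counts.setdefault(word, [0, 0])[index] += 1
        # Every cached value depends on nspam/nham, so drop them all.
        self._cache.clear()

    def spamprob(self, word):
        """Compute the probability for one word on demand, with caching."""
        if word in self._cache:
            return self._cache[word]
        spamcount, hamcount = self.counts.get(word, (0, 0))
        n = spamcount + hamcount
        if n == 0:
            prob = self.unknown_prob  # never-seen word
        else:
            spamratio = spamcount / self.nspam if self.nspam else 0.0
            hamratio = hamcount / self.nham if self.nham else 0.0
            raw = spamratio / (spamratio + hamratio)
            s = self.unknown_strength
            prob = (s * self.unknown_prob + n * raw) / (s + n)
        self._cache[word] = prob
        return prob
```

With a one-message training batch, only the words actually looked up during
scoring cost anything, rather than the whole vocabulary.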
Rob
--
Rob W.W. Hooft || rob@hooft.net || http://www.hooft.net/people/rob/